PATENT

Attorney Docket No.: 019496-002000 
FUNCTIONAL GENOMICS USING ZINC FINGER PROTEINS 

CROSS-REFERENCES TO RELATED APPLICATIONS 
This application is related to USSN 09/229,007, filed January 12, 1999,
and USSN 09/229,037, filed January 12, 1999, herein both incorporated by reference in
their entirety.

STATEMENT REGARDING 
FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT 

Not applicable. 

FIELD OF THE INVENTION 
The present invention provides methods of regulating gene expression 
using recombinant zinc finger proteins, for functional genomics and target validation 
applications. 

BACKGROUND OF THE INVENTION 
Determining the function of a gene of interest is important for identifying 
potential genomic targets for drug discovery. Genes associated with a particular function 
or phenotype can then be validated as targets for discovery of therapeutic compounds. 
Historically, the function of a particular gene has been identified by associating 
expression of the gene with a specific function or phenotype in a biological system
such as a cell or a transgenic animal. 

One known method used to validate the function of a gene is to genetically 
remove the gene from a cell or animal (i.e., create a "knockout") and determine whether 
or not a phenotype (i.e., any change, e.g., morphological, functional, etc., observable by 
an assay) of the cell or animal has changed. This determination depends on whether the 
cell or organism survives without the gene and is not feasible if the gene is required for 
survival. Other genes are subject to counteracting mechanisms that are able to adapt to 
the disappearance of the gene and compensate for its function in other ways. This 
compensation may be so effective, in fact, that the true function of the deleted gene may
go unnoticed. The technical process of creating a "knockout" is laborious and requires 
extensive sequence information, thus commanding immense monetary and technical 
resources if undertaken on a genome wide scale. 

In another example, antisense methods of gene regulation and methods 
that rely on targeted ribozymes are highly unpredictable. Another method for 
experimentally determining the function of a newly discovered gene is to clone its cDNA 
into an expression vector driven by a strong promoter and measure the physiological 
consequence of its over-expression in a transfected cell. This method is also labor
intensive and does not address the physiological consequences of down-regulation of a
target gene. Therefore, simple methods allowing the selective over- and under-expression
of uncharacterized genes would be of great utility to the scientific community. Methods
that permit the regulation of genes in cell model systems, transgenic animals and
transgenic plants would find widespread use in academic laboratories, pharmaceutical
companies, genomics companies and in the biotechnology industry.

An additional use of target validation is in the production of in vivo and in 
vitro assays for drug discovery. Once the gene causing a selected phenotype has been 
identified, cell lines, transgenic animals and transgenic plants could be engineered to 
express a useful protein product or repress a harmful one. These model systems are then
used, e.g., with high throughput screening methodology, to identify lead therapeutic
compounds that regulate expression of the gene of choice, thereby providing a desired
phenotype, e.g., treatment of disease.

Methods currently exist in the art that allow one to alter the expression
of a given gene, e.g., using ribozymes, antisense technology, small molecule regulators,
over-expression of cDNA clones, and gene-knockouts. As described above, these
methods have to date proven to be generally insufficient for many applications and 
typically have not demonstrated either high target efficacy or high specificity in vivo. For 
useful experimental results and therapeutic treatments, these characteristics are desired. 

Gene expression is normally controlled by sequence specific DNA binding 
proteins called transcription factors. These bind in the general proximity (although
occasionally at great distances) of the point of transcription initiation of a gene and 
typically include both a DNA binding domain and a regulatory domain. They act to 
influence the efficiency of formation or function of a transcription initiation complex at 
the promoter. Transcription factors can act in a positive fashion (transactivation) or in a
negative fashion (transrepression). Although transcription factors typically contain a
regulatory domain, repression can also be achieved by steric hindrance via a DNA
binding domain alone.

Transcription factor function can be constitutive (always "on") or 
conditional. Conditional function can be imparted on a transcription factor by a variety of 
means, but the majority of these regulatory mechanisms depend on the sequestering of the
factor in the cytoplasm and the inducible release and subsequent nuclear translocation,
DNA binding and transactivation (or repression). Examples of transcription factors that
function this way include progesterone receptors, sterol response element binding 
proteins (SREBPs) and NF-kappa B. There are examples of transcription factors that 
respond to phosphorylation or small molecule ligands by altering their ability to bind their 
cognate DNA recognition sequence (Hou et al., Science 256:1701 (1994); Gossen &
Bujard, Proc. Natl. Acad. Sci. U.S.A. 89:5547 (1992); Oligino et al., Gene Ther. 5:491-496
(1998); Wang et al., Gene Ther. 4:432-441 (1997); Neering et al., Blood 88:1147-1155
(1996); and Rendahl et al., Nat. Biotechnol. 16:757-761 (1998)).

Zinc finger proteins ("ZFPs") are proteins that can bind to DNA in a 
sequence-specific manner. Zinc fingers were first identified in the transcription factor
TFIIIA from the oocytes of the African clawed toad, Xenopus laevis. Zinc finger proteins
are widespread in eukaryotic cells. An exemplary motif characterizing one class of these
proteins (Cys2His2 class) is -Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (where X is any amino
acid). A single finger domain is about 30 amino acids in length and several structural 
studies have demonstrated that it contains an alpha helix containing the two invariant 
histidine residues co-ordinated through zinc with the two cysteines of a single beta turn. 
To date, over 10,000 zinc finger sequences have been identified in several thousand 
known or putative transcription factors. Zinc finger proteins are involved not only in 
DNA-recognition, but also in RNA binding and protein-protein binding. Current 
estimates are that this class of molecules will constitute the products of about 2% of all 
human genes. 
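
The Cys2His2 consensus given above can be expressed as a pattern over one-letter amino acid codes. The following is a minimal sketch only: the function name and the test sequence are hypothetical, and a match indicates agreement with the consensus spacing, not a demonstrated DNA-binding finger.

```python
import re

# Consensus from the text: Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His,
# in one-letter codes (C = Cys, H = His, "." = any residue).
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_motifs(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping match."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(seq)]


# Hypothetical sequence, constructed only to exercise the pattern.
example = "MKPYACPVESCDRRFSRSDELTRHIRIHTGQKP"
print(find_c2h2_motifs(example))
```

Because a single finger domain is about 30 amino acids long, the matched span covers only the zinc-coordinating core of a finger, and flanking residues would need to be taken from the surrounding sequence.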

The X-ray crystal structure of Zif268, a three-finger domain from a murine 
transcription factor, has been solved in complex with its cognate DNA-sequence and 
shows that each finger can be superimposed on the next by a periodic rotation and 
translation of the finger along the main DNA axis. The structure suggests that each finger 
interacts independently with DNA over 3 base-pair intervals, with side-chains at positions
-1, 2, 3 and 6 on each recognition helix making contacts with the respective DNA triplet
sub-site. The amino terminus of Zif268 is situated at the 3' end of its DNA recognition
subsite. Recent results have indicated that some zinc fingers can bind to a fourth base in a
target segment (Isalan et al., Proc. Natl. Acad. Sci. U.S.A. 94:5617-5621 (1997)). The
fourth base is on the opposite strand from the other three bases recognized by the zinc finger
and complementary to the base immediately 3' of the three base subsite.
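
The one-finger-per-triplet arrangement described above can be sketched as a simple mapping. This is an illustration under stated assumptions: the function name is hypothetical, the target is written 5' to 3', fingers are numbered from the amino terminus, and the possible fourth-base contact on the opposite strand is ignored.

```python
def assign_subsites(target_site):
    """Map finger number (from the amino terminus) to its 3-bp subsite.

    The target is read 5'->3'. Because the amino terminus sits at the
    3' end of the site, finger 1 is paired with the 3'-most triplet.
    """
    site = target_site.replace(" ", "").upper()
    if not site or len(site) % 3 != 0:
        raise ValueError("target site length must be a non-zero multiple of 3")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    return {finger: triplet
            for finger, triplet in enumerate(reversed(triplets), start=1)}


# Example input only: any 9-bp site would serve for a three-finger protein.
print(assign_subsites("GCA GAA GCC"))
```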

The structure of the Zif268-DNA complex also suggested that the DNA 
sequence specificity of a zinc finger protein might be altered by making amino acid 
substitutions at the four helix positions (-1, 2, 3 and 6) on a zinc finger recognition helix. 
Phage display experiments using zinc finger combinatorial libraries to test this 
observation were published in a series of papers in 1994 (Rebar et al., Science 263:671-673
(1994); Jamieson et al., Biochemistry 33:5689-5695 (1994); Choo et al., Proc. Natl.
Acad. Sci. U.S.A. 91:11163-11167 (1994)). Combinatorial libraries were constructed with
randomized side-chains in either the first or middle finger of Zif268 and then isolated 
with an altered Zif268 binding site in which the appropriate DNA sub-site was replaced 
by an altered DNA triplet. Correlation between the nature of introduced mutations and 
the resulting alteration in binding specificity gave rise to a partial set of substitution rules 
for rational design of zinc finger proteins with altered binding specificity. Greisman & 
Pabo, Science 275:657-661 (1997) discuss an elaboration of a phage display method in 
which each finger of a zinc finger protein is successively subjected to randomization and 
selection. This paper reported selection of zinc finger proteins for a nuclear hormone 
response element, a p53 target site and a TATA box sequence. 

Recombinant zinc finger proteins have been reported to have the ability to 
regulate gene expression of transiently expressed reporter genes in cultured cells (see,
e.g., Pomerantz et al., Science 267:93-96 (1995); Liu et al., Proc. Natl. Acad. Sci. U.S.A.
94:5525-5530 (1997); and Beerli et al., Proc. Natl. Acad. Sci. U.S.A. 95:14628-14633
(1998)). For example, Pomerantz et al., Science 267:93-96 (1995) report an attempt to
design a novel DNA binding protein by fusing two fingers from Zif268 with a 
homeodomain from Oct-1. The hybrid protein was then fused with either a 
transcriptional activator or repressor domain for expression as a chimeric protein. The 
chimeric protein was reported to bind a target site representing a hybrid of the subsites of 
its two components. The authors then constructed a reporter vector containing a 
luciferase gene operably linked to a promoter and a hybrid site for the chimeric DNA
binding protein in proximity to the promoter. The authors reported that their chimeric 
DNA binding protein could activate or repress expression of the luciferase gene. 

Liu et al., Proc. Natl. Acad. Sci. U.S.A. 94:5525-5530 (1997) report
forming a composite zinc finger protein by using a peptide spacer to link two component
zinc finger proteins, each having three fingers. The composite protein was then further 
linked to transcriptional activation or repression domains. It was reported that the 
resulting chimeric protein bound to a target site formed from the target segments bound 
by the two component zinc finger proteins. It was further reported that the chimeric zinc 
finger protein could activate or repress transcription of a reporter gene when its target site
was inserted into a reporter plasmid in proximity of a promoter operably linked to the
reporter. 

Beerli et al., Proc. Natl. Acad. Sci. U.S.A. 95:14628-14633 (1998) report
construction of a chimeric six finger zinc finger protein fused to either a KRAB, ERD, or
SID transcriptional repressor domain, or the VP16 or VP64 transcriptional activation
domain. This chimeric zinc finger protein was designed to recognize an 18 bp target site
in the 5' untranslated region of the human erbB-2 gene. Using this construct, the authors 
of this study report both activation and repression of a transiently expressed reporter 
luciferase construct linked to the erbB-2 promoter.

In addition, a recombinant zinc finger protein was reported to repress
expression of an integrated plasmid construct encoding a bcr-abl oncogene (Choo et al.,
Nature 372:642-645 (1994)). The target segment to which the zinc finger proteins bound
was a nine base sequence GCA GAA GCC chosen to overlap the junction created by a
specific oncogenic translocation fusing the genes encoding bcr and abl. The intention
was that a zinc finger protein specific to this target site would bind to the oncogene
without binding to abl or bcr component genes. The authors used phage display to select
a variant zinc finger protein that bound to this target segment. The variant zinc finger
protein thus isolated was then reported to repress expression of a stably transfected
bcr-abl construct in a cell line.

To date, these methods have focused on regulation of either transiently
expressed, known genes, or on regulation of known exogenous genes that have been
integrated into the genome. In contrast, specific regulation of a candidate gene or list of
genes to identify the cause of a selected phenotype has not been demonstrated in the art.

SUMMARY OF THE INVENTION
The present invention thus provides for the first time methods of
identifying a gene or genes associated with a selected phenotype, e.g., for drug discovery,
target validation, or functional genomics.

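In one aspect, the present invention provides a method of identifying the
biological function of a candidate gene, the method comprising the steps of: (i) selecting
a first candidate gene; (ii) providing a first zinc finger protein that binds to a first target
site of the first candidate gene and a second zinc finger protein that binds to a target site of a
second gene; (iii) culturing a first cell under conditions where the first zinc finger protein
contacts the first candidate gene, and culturing a second cell under conditions where the
second zinc finger protein contacts the second candidate gene, wherein the first and the
second zinc finger proteins modulate expression of the first and second candidate genes;
and (iv) assaying for a selected phenotype, thereby identifying whether or not the first
candidate gene is associated with the selected phenotype.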

In another aspect, the present invention provides a method of identifying
the biological function of a candidate gene, the method comprising the steps of: (i)
identifying a plurality of candidate genes; (ii) providing a first zinc finger protein that
binds to a first target site of a first candidate gene; (iii) culturing a first cell under
conditions where the first zinc finger protein contacts the first candidate gene, wherein the
first zinc finger protein modulates expression of the first candidate gene; (iv) determining
the expression pattern of the candidate genes and determining whether or not the first
candidate gene is associated with the selected phenotype; and (v) repeating steps (ii)-(iv)
for each candidate gene.

In another aspect, the present invention provides a method of identifying
the biological function of a candidate gene, the method comprising the steps of: (i)
selecting a first candidate gene; (ii) providing a first zinc finger protein that binds to a
first target site of the first candidate gene and a second zinc finger protein that binds to a
second target site of the first candidate gene; (iii) culturing a first cell under conditions
where the first zinc finger protein contacts the first candidate gene, and culturing a second
cell under conditions where the second zinc finger protein contacts the first candidate gene,
wherein the first and the second zinc finger proteins modulate expression of the first
candidate gene; and (iv) assaying for a selected phenotype, thereby identifying whether or
not the first candidate gene is associated with the selected phenotype.

In another aspect, the present invention provides a method of identifying
the biological function of a candidate gene, the method comprising the steps of: (i)
selecting a first candidate gene; (ii) providing a first zinc finger protein that binds to a
first target site of the first candidate gene; (iii) culturing a first cell under conditions
where the first zinc finger protein contacts the first candidate gene, wherein the
first zinc finger protein modulates expression of the first candidate gene; and (iv)
assaying for a selected phenotype, thereby identifying whether or not the first candidate
gene is associated with the selected phenotype.

In one embodiment, the method further comprises providing a third zinc
finger protein that binds to a second target site of the first candidate gene. In one
embodiment, the method further comprises providing a third zinc finger protein that binds
to a target site of a second candidate gene. In another embodiment, the method further
comprises selecting a plurality of candidate genes and providing a plurality of zinc finger
proteins that bind to a target site of each candidate gene.

In one embodiment, the first candidate gene is partially encoded by an
EST of at least about 200 nucleotides in length. In one embodiment, the first candidate
gene and the second gene are both associated with the selected phenotype. In one
embodiment, the second gene is a control gene. In one embodiment, the first and second
cells are the same cell, wherein the cell comprises the first and second candidate genes. In
one embodiment, the first and the second candidate genes are endogenous genes.

In one embodiment, expression of the candidate genes is inhibited by at
least about 50%. In one embodiment, expression of the candidate genes is activated by at
least about [...]. In one embodiment, the modulation of expression is activation of gene
expression that prevents repression of gene expression. In one embodiment, the
modulation of expression is inhibition of gene expression that prevents gene activation.

In one embodiment, the zinc finger proteins are fusion proteins comprising
one or more regulatory domains. In one embodiment, the regulatory domain is selected
from the group consisting of a transcriptional repressor, a methyl transferase, a
transcriptional activator, a histone acetyltransferase, and a histone deacetylase.
In one embodiment, the cell is selected from the group consisting of an
animal cell, a plant cell, a bacterial cell, a protozoal cell, a fungal cell, a mammalian cell,
or a human cell. In one embodiment, the cell comprises less than about 1.5x10^6 copies of
each zinc finger protein.

In one embodiment, the first and second zinc finger proteins are encoded 
by an expression vector comprising a zinc finger protein nucleic acid operably linked to a 
promoter, and wherein the method further comprises the step of first administering the 
expression vector to the cell. In one embodiment, expression of the zinc finger proteins is 
induced by administration of an exogenous agent. In one embodiment, expression of the 
zinc finger proteins is under small molecule control. In one embodiment, expression of 
the first zinc finger protein and expression of the second zinc finger protein are under 
different small molecule control, wherein both the first and the second zinc finger protein 
are fusion proteins comprising a regulatory domain, and wherein the first and the second 
zinc finger proteins are expressed in the same cell. In one embodiment, both the first and 
second zinc finger proteins comprise regulatory domains that are repressors. In one 
embodiment, the first zinc finger protein comprises a regulatory domain that is an 
activator, and the second zinc finger protein comprises a regulatory domain that is a 
repressor. 

In one embodiment, the expression vector is a viral vector. In another 
embodiment, the expression vector is a retroviral expression vector, an adenoviral 
expression vector, or an AAV expression vector. In one embodiment, the zinc finger 
proteins are encoded by a nucleic acid operably linked to an inducible promoter. 

In one embodiment, the target site is upstream of a transcription initiation 
site of the candidate gene. In one embodiment, the target site is downstream of a 
transcription initiation site of the candidate gene. In one embodiment, the target site is 
adjacent to a transcription initiation site of the candidate gene. In another embodiment, 
the target site is adjacent to an RNA polymerase pause site downstream of a transcription 
initiation site of the candidate gene. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 shows a schematic representation of target validation using zinc
finger proteins to regulate gene expression.

Figure 2 shows zinc finger protein expression constructs. 
Figure 3 shows luciferase reporter constructs for zinc finger protein
regulation of gene expression.

Figure 4 shows the effect of zinc finger proteins on luciferase reporter
gene activation.

Figure 5 shows [...] by zinc
finger proteins.

DETAILED DESCRIPTION OF THE INVENTION 

As described herein, the present invention provides zinc finger proteins
used in assays to determine the phenotypic consequences and function of gene
expression. The recent advances in analytical techniques, coupled with focused mass
sequencing efforts, have created the opportunity to identify and characterize many more
molecular targets than were previously available. This new information about genes and
their function will speed along basic biological understanding and present many new
targets for therapeutic intervention. In some cases analytical tools have not
kept pace with the generation of new data. An example is provided by recent advances in the
measurement of global differential gene expression. These methods, typified by gene
expression microarrays, differential cDNA cloning frequencies, subtractive hybridization
and differential display methods, can very rapidly identify genes that are up- or down-
regulated in different tissues or in response to specific stimuli. Increasingly, such
methods are being used to explore biological processes such as transformation, tumor
progression, the inflammatory response, neurological disorders, etc. One can now very
easily generate long lists of differentially expressed genes that correlate with a given
physiological phenomenon, but demonstrating a causative relationship between an
individual differentially expressed gene and the phenomenon is difficult. Until now, simple
methods for assigning function to differentially expressed genes have not kept pace with the
ability to monitor differential gene expression.

However, zinc finger protein technology can be used to rapidly analyze
differential gene expression studies. Engineered zinc finger proteins can be readily used
to up- or down-regulate any candidate target gene. Very little sequence information is
required to create a gene-specific DNA binding domain. This makes the zinc finger
protein technology ideal for analysis of long lists of poorly characterized differentially
expressed genes. One can simply build a zinc finger-based DNA binding domain for
each candidate gene, create chimeric up and down-regulating artificial transcription
factors and test the consequence of up or down-regulation on the phenotype under study
(transformation, response to a cytokine etc.) by switching the candidate genes on or off
one at a time in a model system.
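
The one-gene-at-a-time screen described above can be laid out as a small planning and scoring routine. This is a sketch under stated assumptions, not a protocol from the specification: the function names, gene names, construct labels, readout values and the two-fold threshold are all hypothetical placeholders.

```python
from itertools import product


def build_screen(candidate_genes, domains=("activator", "repressor")):
    """Return one planned condition per (candidate gene, regulatory domain).

    Each condition perturbs a single gene, mirroring the "one at a time"
    switching described in the text. Labels are placeholders only.
    """
    plan = []
    for gene, domain in product(candidate_genes, domains):
        plan.append({
            "gene": gene,
            "construct": f"ZFP({gene})-{domain}",
            "direction": "up" if domain == "activator" else "down",
        })
    return plan


def flag_candidates(readouts, control_value, min_fold_change=2.0):
    """Return genes whose perturbation shifts a positive phenotype readout
    by at least min_fold_change relative to the untreated control.

    readouts maps (gene, direction) to a measured value. The threshold is
    an arbitrary illustrative choice.
    """
    flagged = set()
    for (gene, _direction), value in readouts.items():
        fold = max(value, control_value) / min(value, control_value)
        if fold >= min_fold_change:
            flagged.add(gene)
    return sorted(flagged)


plan = build_screen(["geneA", "geneB"])  # hypothetical gene names
readouts = {  # made-up numbers, for illustration only
    ("geneA", "up"): 30.0, ("geneA", "down"): 4.0,
    ("geneB", "up"): 11.0, ("geneB", "down"): 9.5,
}
print(len(plan), flag_candidates(readouts, control_value=10.0))
```

Keeping the plan separate from the scoring step makes it easy to add control genes or a second target site per gene without changing how hits are called.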

Additionally, greater experimental control can be imparted by zinc finger 
proteins than can be achieved by more conventional methods. This is because the 
production and/or function of an engineered zinc finger protein can be placed under small 
molecule control. Examples of this approach are provided by the Tet-On system, the 
ecdysone-regulated system and a system incorporating a chimeric factor including a 
mutant progesterone receptor. These systems are all capable of indirectly imparting small 
molecule control on any candidate gene of interest or any transgene by placing the 
function and/or expression of a zinc finger protein regulator under small molecule 
control. In one embodiment, a cell comprises two zinc finger proteins. The zinc finger 
proteins either target two different candidate genes (i.e., two genes associated with the 
same phenotype), or two different target sites on the same candidate gene. Each zinc 
finger protein also comprises a regulatory domain. Expression of each zinc finger protein 
is under different small molecule control, allowing variations in the degree of activation
or repression of gene expression.

The present application therefore provides for the first time methods of 
using zinc finger proteins for identifying a gene or genes associated with a selected phenotype,
e.g., for drug discovery target validation or for functional genomics. The present
invention provides zinc finger DNA binding proteins that have been engineered to
specifically recognize genes, with high efficacy. Modulation of gene expression using
zinc finger proteins is used to determine the biological function of a gene, or a gene
represented by an EST, and to validate the function of potential target genes for drug 
discovery. 

In one embodiment, expression of at least two different genes is regulated, 
using different zinc finger proteins to regulate each gene. One of the genes is a candidate 
gene, and the other gene can be a control gene or a second candidate gene. Cells 
expressing the genes are contacted with zinc finger proteins, or nucleic acids encoding 
zinc finger proteins. Both the genes can be expressed in the same cell, or the genes can 
be each expressed in a different cell. After expression of the first and second genes is 
modulated by the zinc finger protein, the cells are assayed for changes in a selected
phenotype, thereby identifying the function of the candidate gene or genes. In another
embodiment, two zinc finger proteins target the same candidate gene at two different
target sites. The methods of the invention can be applied both to functional genomics,
which typically refers to identifying genes associated with a particular phenotype, and to
target validation, which typically refers to identifying genes that are suitable for use in 
drug discovery assays. 

As a result, the zinc finger proteins of the invention can be used to identify
genes that cause a selected phenotype, through activation and/or repression of gene
transcription. Zinc finger proteins that bind to a promoter region can be used in the
present invention, but zinc finger proteins can also regulate gene expression by binding to
other regions of the gene. Extensive sequence information is therefore not required to 
examine expression of a candidate gene using zinc finger proteins. ESTs therefore can be 
used in the assays of the invention, to determine their biological function. 

Furthermore, the zinc finger proteins can also be linked to regulatory
domains, creating chimeric transcription factors to activate or repress transcription. In
one embodiment, the methods of regulation use zinc finger proteins wherein the gene
encoding the zinc finger protein is linked to molecular switches controlled by small
molecules. The gene expression of the zinc finger proteins is therefore conditional and
can be regulated using small molecules, thereby providing conditional regulation of
candidate gene expression.

Such functional genomics assays allow for discovery of novel human and
mammalian therapeutic applications, including the discovery of novel drugs, for, e.g.,
treatment of genetic diseases, cancer, fungal, protozoal, bacterial, and viral infection,
ischemia, vascular disease, arthritis, immunological disorders, etc. Examples of assay
systems for changes in phenotype include, e.g., transformation assays, e.g., changes in
proliferation, anchorage dependence, growth factor dependence, foci formation, growth in
soft agar, tumor proliferation in nude mice, and tumor vascularization in nude mice;
apoptosis assays, e.g., DNA laddering and cell death, expression of genes involved in
apoptosis; signal transduction assays, e.g., changes in intracellular calcium, cAMP,
cGMP, IP3, changes in hormone and neurotransmitter release; receptor assays, e.g.,
estrogen receptor and cell growth; growth factor assays, e.g., EPO, hypoxia and
erythrocyte colony forming units assays; enzyme product assays, e.g., FAD-2 induced oil
desaturation; transcription assays, e.g., reporter gene assays; and protein production
assays, e.g., VEGF ELISAs.

In one embodiment, a plurality of candidate genes is provided, and a first 
zinc finger protein is used to modulate expression of one of the candidate genes, while the 
expression pattern of the other candidate genes is examined. This step is repeated for
each of the candidate genes, and changes in the expression patterns are used to determine 
the biological function of the genes. The expression data can then be analyzed to 
reconstruct the order or cascade of genes in a pathway that is associated with a selected 
phenotype. 
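
One way to picture the reconstruction of a gene cascade from such perturbation data is a topological ordering. The sketch below is an assumption-laden illustration, not the analysis used in the specification: it treats "modulating gene X changes expression of gene Y" as evidence that X acts upstream of Y, uses hypothetical gene names and made-up results, and relies on `graphlib` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def order_pathway(effects):
    """Order genes from upstream to downstream.

    effects maps each perturbed gene to the set of other candidate genes
    whose expression changed when it was modulated.
    """
    predecessors = {gene: set() for gene in effects}
    for regulator, responders in effects.items():
        for responder in responders:
            predecessors.setdefault(responder, set()).add(regulator)
    # static_order() yields every gene after all of its inferred regulators.
    return list(TopologicalSorter(predecessors).static_order())


# Made-up results: modulating geneA shifts geneB and geneC, and so on.
effects = {
    "geneA": {"geneB", "geneC"},
    "geneB": {"geneC"},
    "geneC": set(),
}
print(order_pathway(effects))
```

Feedback loops in the data would make the graph cyclic, in which case `static_order()` raises `graphlib.CycleError`; real pathways with feedback would therefore need a different treatment.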

As described herein, zinc finger proteins can be designed to recognize any
suitable target site, for regulation of expression of any control or candidate gene of
choice. Examples of target genes suitable for regulation include VEGF, CCR5, ERα,
Her2/Neu, Tat, Rev, HBV C, S, X, and P, LDL-R, PEPCK, CYP7, Fibrinogen, ApoB,
Apo E, Apo(a), renin, NF-κB, I-κB, TNF-α, FAS ligand, amyloid precursor protein, atrial
natriuretic factor, ob-leptin, ucp-1, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-12, G-CSF,
GM-CSF, Epo, PDGF, PAF, p53, Rb, fetal hemoglobin, dystrophin, utrophin, GDNF, NGF,
IGF-1, VEGF receptors flt and flk, topoisomerase, telomerase, bcl-2, cyclins, angiostatin,
IGF, ICAM-1, STATs, c-myc, c-myb, TH, PTI-1, polygalacturonase, EPSP synthase,
FAD2-1, delta-12 desaturase, delta-9 desaturase, delta-15 desaturase, acetyl-CoA
carboxylase, acyl-ACP-thioesterase, ADP-glucose pyrophosphorylase, starch synthase,
cellulose synthase, sucrose synthase, senescence-associated genes, heavy metal chelators,
fatty acid hydroperoxide lyase, viral genes, protozoal genes, fungal genes, and bacterial
genes. In general, suitable genes to be regulated include cytokines, lymphokines, growth
factors, mitogenic factors, chemotactic factors, onco-active factors, receptors, potassium
channels, G-proteins, signal transduction molecules, and other disease-related genes.

Candidate genes are selected by methods known to those of skill in the art,
e.g., by gene expression microarrays, differential cDNA cloning frequencies, subtractive
hybridization, differential display methods, by cloning ESTs from cells or tissues of
interest, by identifying genes that are lethal upon knockout, by identifying genes that are
up- or down-regulated in response to a particular developmental or cellular event or
stimuli, by identifying genes that are up- or down-regulated in certain disease and
pathogenic states, by identifying mutations and RFLPs, by identifying genes associated
with regions of chromosomes known to be involved in inherited diseases, by identifying
genes that are temporally regulated, e.g., in a pathogenic organism, differences based on
SNPs, etc.

A general theme in transcription factor function is that simple binding and,
in some cases, sufficient proximity to the promoter are all that is generally needed. Exact
positioning relative to the promoter, orientation, and within limits, distance do not matter
greatly. In some cases enhancers are found positioned large distances away from the
gene of interest. In addition, for repression of gene expression, often simple steric
hindrance of transcription initiation is sufficient. These features allow considerable
flexibility in choosing target sites for zinc finger proteins. The target site recognized by
the zinc finger protein therefore can be any suitable site in the target gene that will allow
activation or repression of gene expression by a zinc finger protein, optionally linked to a
regulatory domain. Preferred target sites include regions adjacent to, downstream, or
upstream of the transcription start site. In addition, target sites can be located in
enhancer regions, repressor sites, RNA polymerase pause sites, specific regulatory
sites (e.g., SP-1 sites, hypoxia response elements, nuclear receptor recognition elements,
p53 binding sites), and sites in the cDNA encoding region or in an expressed sequence tag
(EST) coding region. As described below, typically each finger recognizes 2-4 base
pairs, with a two finger zinc finger protein binding to a 4 to 7 bp target site, a three finger
zinc finger protein binding to a 6 to 10 base pair site, and a six finger zinc finger protein
binding to two adjacent target sites, each target site having from 6-10 base pairs.

Recognition of adjacent target sites by either associated or individual zinc
finger proteins can be used to produce enhanced binding of the zinc finger proteins,
resulting in an affinity that is greater than the affinity of the zinc finger proteins when
individually bound to their target site. In one embodiment, a six finger zinc finger protein
is produced as a fusion protein linked by an amino acid linker, and the resulting zinc
finger protein recognizes an approximately 18 base pair target site (see, e.g., Liu et al.,
Proc. Natl. Acad. Sci. U.S.A. 94:5525-5530 (1997)). An 18 base pair target site is
expected to provide specificity in the human genome, as a target site of that size should
occur only once in every 3x10^10 base pairs, and the size of the human genome is 3.5x10^9
base pairs (see, e.g., Liu et al., Proc. Natl. Acad. Sci. U.S.A. 94:5525-5530 (1997)). In
another embodiment, the two three-fingered portions of the six fingered zinc finger
protein are non-covalently associated, through a leucine zipper, a STAT protein
N-terminal domain, or the FK506 binding protein (see, e.g., O'Shea, Science 254:539
(1991); Barahmand-Pour et al., Curr. Top. Microbiol. Immunol. 211:121-128 (1996);
Klemm et al., Annu. Rev. Immunol. 16:569-592 (1998); Ho et al., Nature 382:822-826
(1996)).

As described herein, two zinc finger proteins are administered to a cell,
recognizing different target genes, e.g., a candidate gene and a control gene, or two
candidate genes, or two different target sites for the same gene. Optionally, a plurality of
zinc finger proteins can be administered, which recognize two or more different target
sites in the same gene. When two candidate genes are examined, both the first and the
second gene may be required for the phenotype. The candidate genes may be endogenous
genes or exogenous genes. In one embodiment, more than one candidate gene is
associated with a selected phenotype.

In another embodiment, the zinc finger protein is linked to one or
more regulatory domains, described below. Preferred regulatory domains include
transcription factor repressor or activator domains such as KRAB and VP16, co-repressor
and co-activator domains, DNA methyltransferases, histone acetyltransferases, histone
deacetylases, and endonucleases such as FokI. For repression of gene expression,
typically the expression of the gene is reduced by about 20% (i.e., 80% of non-zinc finger
protein modulated expression), more preferably by about 50% (i.e., 50% of non-zinc
finger protein modulated expression), more preferably by about 75-100% (i.e., 25% to 0%
of non-zinc finger protein modulated expression). For activation of gene expression,
typically expression is activated by about 1.5 fold (i.e., 150% of non-zinc finger protein
modulated expression), preferably 2 fold (i.e., 200% of non-zinc finger protein modulated
expression), more preferably 5-10 fold (i.e., 500-1000% of non-zinc finger protein
modulated expression), up to at least 100 fold or more.

The expression of engineered zinc finger protein activators and repressors
can also be controlled by small molecule systems typified by the tet-regulated systems
and the RU-486 system (see, e.g., Gossen & Bujard, Proc. Natl. Acad. Sci. U.S.A.
89:5547 (1992); Oligino et al., Gene Ther. 5:491-496 (1998); Wang et al., Gene Ther.
4:432-441 (1997); Neering et al., Blood 88:1147-1155 (1996); and Rendahl et al., Nat.
Biotechnol. 16:757-761 (1998)). These impart small molecule control on the expression
of the zinc finger protein activators and repressors and thus impart small molecule control
on the target gene(s) of interest. This beneficial feature could be used in cell culture
models, and in transgenic animals and plants.

Definitions

As used herein, the following terms have the meanings ascribed to them 
unless specified otherwise. 
A "candidate gene" refers to a cellular, viral, episomal, microbial,
protozoal, fungal, animal, plant, chloroplastic, or mitochondrial gene. This term also
refers to a microbial or viral gene that is part of a naturally occurring microbial or viral
genome in a microbially or virally infected cell. The microbial or viral genome can be
extrachromosomal or integrated into the host chromosome. This term also encompasses
endogenous and exogenous genes, as well as cellular genes that are identified as ESTs.
Often, the candidate genes of the invention are those for which the biological function is
unknown. An assay of choice is used to determine whether or not the gene is associated
with a selected phenotype upon regulation of candidate gene expression with a zinc finger
protein. If the biological function is known, typically the candidate gene acts as a control
gene, or is used to determine if one or more additional genes are associated with the same
phenotype, or is used to determine if the gene participates with other genes in a particular
phenotype.

A "selected phenotype" refers to any phenotype, e.g., any observable 
characteristic or functional effect that can be measured in an assay such as changes in cell
growth, proliferation, morphology, enzyme function, signal transduction, expression
patterns, downstream expression patterns, reporter gene activation, hormone release,
growth factor release, neurotransmitter release, ligand binding, apoptosis, and product
formation. Such assays include, e.g., transformation assays, e.g., changes in proliferation,
anchorage dependence, growth factor dependence, foci formation, growth in soft agar,
tumor proliferation in nude mice, and tumor vascularization in nude mice; apoptosis
assays, e.g., DNA laddering and cell death, expression of genes involved in apoptosis;
signal transduction assays, e.g., changes in intracellular calcium, cAMP, cGMP, IP3,
changes in hormone and neurotransmitter release; receptor assays, e.g., estrogen receptor
and cell growth; growth factor assays, e.g., EPO, hypoxia and erythrocyte colony forming
units assays; enzyme product assays, e.g., FAD-2 induced oil desaturation; transcription
assays, e.g., reporter gene assays; and protein production assays, e.g., VEGF ELISAs.

A candidate gene is "associated with" a selected phenotype if modulation 
of gene expression of the candidate gene causes a change in the selected phenotype.

The term "zinc finger protein" or "ZFP" refers to a protein having DNA
binding domains that are stabilized by zinc. The individual DNA binding domains are
typically referred to as "fingers." A zinc finger protein has at least one finger, typically two
fingers, three fingers, or six fingers. Each finger binds from two to four base pairs of
DNA, typically three or four base pairs of DNA. A zinc finger protein binds to a nucleic
acid sequence called a target site or target segment. Each finger typically comprises an
approximately 30 amino acid, zinc-coordinating, DNA-binding subdomain. An
exemplary motif characterizing one class of these proteins (Cys2His2 class) is
-Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (where X is any amino acid). Studies have demonstrated
that a single zinc finger of this class consists of an alpha helix containing the two invariant
histidine residues co-ordinated with zinc along with the two cysteine residues of a single
beta turn (see, e.g., Berg & Shi, Science 271:1081-1085 (1996)).

A "target site" is the nucleic acid sequence recognized by a zinc finger 
protein. A single target site typically has about four to about ten base pairs. Typically, a 
two-fingered zinc finger protein recognizes a four to seven base pair target site, a three-
fingered zinc finger protein recognizes a six to ten base pair target site, and a six fingered
zinc finger protein recognizes two adjacent nine to ten base pair target sites. 

The term "adjacent target sites" refers to non-overlapping target sites that 
are separated by zero to about 5 base pairs. 
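
The definition of adjacent target sites can be sketched as a simple interval check. This is a minimal illustration only: the function name `are_adjacent`, the 0-based half-open coordinate convention, and the reading of "about 5" as exactly 5 are assumptions not found in the text.

```python
def are_adjacent(site_a, site_b, max_gap=5):
    """Return True if two target sites are non-overlapping and 0..max_gap bp apart.

    Each site is a (start, end) pair of 0-based, half-open coordinates on the
    same strand (an assumed convention for this sketch).
    """
    # Order the sites so that the first one starts no later than the second.
    (a_start, a_end), (b_start, b_end) = sorted([site_a, site_b])
    gap = b_start - a_end  # a negative gap means the sites overlap
    return 0 <= gap <= max_gap
```

For example, two nine base pair sites at (0, 9) and (9, 18) are separated by zero base pairs and would be treated as adjacent under these assumptions.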

"Kd" refers to the dissociation constant for the compound, i.e., the 
concentration of a compound (e.g., a zinc finger protein) that gives half maximal binding 
of the compound to its target (i.e., half of the compound molecules are bound to the 
target) under given conditions (i.e., when [target] << Kd), as measured using a given assay
system (see, e.g., U.S. Patent No. 5,789,538). The assay system used to measure the Kd
should be chosen so that it gives the most accurate measure of the actual Kd of the zinc
finger protein. Any assay system can be used, as long as it gives an accurate measurement
of the actual Kd of the zinc finger protein. In one embodiment, the Kd for the zinc finger
proteins of the invention is measured using an electrophoretic mobility shift assay
("EMSA"), as described herein. Unless an adjustment is made for zinc finger protein
purity or activity, the Kd calculations made using the methods described herein may result 
in an underestimate of the true Kd of a given zinc finger protein. Optionally, the Kd of a 
zinc finger protein used to modulate transcription of a candidate gene is less than about 
100 nM, or less than about 75 nM, or less than about 50 nM, or less than about 25 nM. 
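
The relationship between Kd and half maximal binding can be illustrated with a short sketch. It assumes the standard 1:1 binding isotherm with [target] << Kd, which the text does not spell out; the function name `fraction_bound` and the use of nM units are illustrative choices.

```python
def fraction_bound(zfp_conc_nM, kd_nM):
    """Fraction of target bound at a given zinc finger protein concentration.

    Assumes simple 1:1 binding with [target] << Kd, so that the free protein
    concentration is approximately the total protein concentration.
    """
    return zfp_conc_nM / (zfp_conc_nM + kd_nM)


# At a protein concentration equal to Kd, binding is half maximal.
half_max = fraction_bound(25.0, 25.0)
```

Under this model, a zinc finger protein with a Kd of 25 nM gives half maximal binding at 25 nM protein and three-quarters of maximal binding at 75 nM.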

The phrase "adjacent to a transcription initiation site" refers to a target site
that is within about 50 bases either upstream or downstream of a transcription initiation
site. "Upstream" of a transcription initiation site refers to a target site that is more than
about 50 bases 5' of the transcription initiation site. "Downstream" of a transcription
initiation site refers to a target site that is more than about 50 bases 3' of the transcription
initiation site.
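
These three positional terms can be sketched as a classification by offset. The sketch makes several assumptions for illustration: the function name `classify_site`, the reduction of a target site to a single coordinate, coordinates that increase 5' to 3' on the coding strand, and the reading of "about 50" as exactly 50.

```python
def classify_site(site_pos, tss_pos, window=50):
    """Classify a target site relative to a transcription initiation site.

    Positions are coordinates on the coding strand, increasing 5' to 3'
    (an assumed convention for this sketch).
    """
    offset = site_pos - tss_pos
    if abs(offset) <= window:
        return "adjacent"
    # Negative offsets lie 5' of the initiation site, positive offsets 3'.
    return "upstream" if offset < 0 else "downstream"
```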

The phrase "RNA polymerase pause site" is described in Uptain et al.,
Annu. Rev. Biochem. 66:117-172 (1997).

"Administering" an expression vector, nucleic acid, zinc finger protein, or 
a delivery vehicle to a cell comprises transducing, transfecting, electroporating,
translocating, fusing, phagocytosing, or biolistic methods, etc., i.e., any means by which a
protein or nucleic acid can be transported across a cell membrane and preferably into the
nucleus of a cell, including administration of naked DNA.

A "delivery vehicle" refers to a compound, e.g., a liposome, toxin, or a 
membrane translocation polypeptide, which is used to administer a zinc finger protein.
Delivery vehicles can also be used to administer nucleic acids encoding zinc finger
proteins, e.g., a lipid:nucleic acid complex, an expression vector, a virus, and the like.

The terms "modulating expression," "inhibiting expression," and
"activating expression" of a gene refer to the ability of a zinc finger protein to activate or
inhibit transcription of a gene. Activation includes prevention of transcriptional
inhibition (i.e., prevention of repression of gene expression) and inhibition includes
prevention of transcriptional activation (i.e., prevention of gene activation).

"Activation of gene expression that prevents repression of gene 
expression" refers to the ability of a zinc finger protein to block or prevent binding of a
repressor molecule.

"Inhibition of gene expression that prevents gene activation" refers to the
ability of a zinc finger protein to block or prevent binding of an activator molecule.

Modulation can be assayed by determining any parameter that is indirectly
or directly affected by the expression of the target gene. Such parameters include, e.g.,
changes in RNA or protein levels, changes in protein activity, changes in product levels,
changes in downstream gene expression, changes in reporter gene transcription
(luciferase, CAT, β-galactosidase, β-glucuronidase, GFP (see, e.g., Misteli & Spector,
Nature Biotechnology 15:961-964 (1997))); changes in signal transduction,
phosphorylation and dephosphorylation, receptor-ligand interactions, second messenger
concentrations (e.g., cGMP, cAMP, IP3, and Ca2+), cell growth, and neovascularization,
etc., as described herein. These assays can be in vitro, in vivo, and ex vivo. Such
functional effects can be measured by any means known to those skilled in the art, e.g.,
measurement of RNA or protein levels, measurement of RNA stability, identification of
downstream or reporter gene expression, e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, ligand binding assays;
changes in intracellular second messengers such as cGMP and inositol triphosphate (IP3);
changes in intracellular calcium levels; cytokine release, and the like, as described herein.

To determine the level of gene expression modulation by a zinc finger 
protein, cells contacted with zinc finger proteins are compared to control cells, e.g., 
without the zinc finger protein or with a non-specific zinc finger protein, to examine the 
extent of inhibition or activation. Control samples are assigned a relative gene expression 
activity value of 100%. Modulation/inhibition of gene expression is achieved when the 
gene expression activity value relative to the control is about 80%, preferably 50% (i.e., 
0.5x the activity of the control), more preferably 25%, more preferably 5-0%. 
Modulation/activation of gene expression is achieved when the gene expression activity 
value relative to the control is 110%, more preferably 150% (i.e., 1.5x the activity of the
control), more preferably 200-500%, more preferably 1000-2000% or more. 
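
The comparison against control cells can be sketched as follows. The function names `relative_expression` and `classify_modulation` are hypothetical, and the cutoffs simply restate the least stringent values given above (about 80% of control for inhibition, 110% of control for activation); any real assay would need its own measurement error model.

```python
def relative_expression(sample, control):
    """Expression of the sample as a percentage of the control (control = 100%)."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return 100.0 * sample / control


def classify_modulation(sample, control, inhibit_at=80.0, activate_at=110.0):
    """Label a measurement using the least stringent thresholds in the text."""
    pct = relative_expression(sample, control)
    if pct <= inhibit_at:
        return "inhibition"
    if pct >= activate_at:
        return "activation"
    return "no modulation"
```

For instance, a sample reading of 40 against a control reading of 80 corresponds to 50% of control expression, i.e., 0.5x the activity of the control.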

A "transcriptional activator" and a "transcriptional repressor" refer to 
proteins or effector domains of proteins that have the ability to modulate transcription, as 
described above. Such proteins include, e.g., transcription factors and co-factors (e.g., 
KRAB, MAD, ERD, SID, nuclear factor kappa B subunit p65, early growth response
factor 1, and nuclear hormone receptors, VP16, VP64), endonucleases, integrases,
recombinases, methyltransferases, histone acetyltransferases, histone deacetylases, etc.
Activators and repressors include co-activators and co-repressors (see, e.g., Utley et al.,
Nature 394:498-502 (1998)).

A "regulatory domain" refers to a protein or a protein domain that has 
transcriptional modulation activity when tethered to a DNA binding domain, i.e., a zinc 
finger protein. Typically, a regulatory domain is covalently or non-covalently linked to a 
zinc finger protein to effect transcription modulation. Alternatively, a zinc finger protein 
can act alone, without a regulatory domain, to effect transcription modulation.

The term "heterologous" is a relative term, which when used with
reference to portions of a nucleic acid indicates that the nucleic acid comprises two or
more subsequences that are not found in the same relationship to each other in nature.
For instance, a nucleic acid that is recombinantly produced typically has two or more
sequences from unrelated genes synthetically arranged to make a new functional nucleic
acid, e.g., a promoter from one source and a coding region from another source. The two
nucleic acids are thus heterologous to each other in this context. When added to a cell,
the recombinant nucleic acids would also be heterologous to the endogenous genes of the
cell. Thus, in a chromosome, a heterologous nucleic acid would include a non-native
(non-naturally occurring) nucleic acid that has integrated into the chromosome, or a non-
native (non-naturally occurring) extrachromosomal nucleic acid.

Similarly, a heterologous protein indicates that the protein comprises two 
or more subsequences that are not found in the same relationship to each other in nature 
(e.g., a "fusion protein," where the two subsequences are encoded by a single nucleic acid 
sequence). See, e.g., Ausubel, supra, for an introduction to recombinant techniques. 

The term "recombinant" when used with reference, e.g., to a cell, or 
nucleic acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has 
been modified by the introduction of a heterologous nucleic acid or protein or the 
alteration of a native nucleic acid or protein, or that the cell is derived from a cell so 
modified. Thus, for example, recombinant cells express genes that are not found within 
the native (naturally occurring) form of the cell or express a second copy of a native gene 
that is otherwise normally or abnormally expressed, under expressed or not expressed at 
all. 

A "promoter" is defined as an array of nucleic acid control sequences that 
direct transcription. As used herein, a promoter typically includes necessary nucleic acid 
sequences near the start site of transcription, such as, in the case of certain RNA 
polymerase II type promoters, a TATA element, enhancer, CCAAT box, SP-1 site, etc. 
As used herein, a promoter also optionally includes distal enhancer or repressor elements, 
which can be located as much as several thousand base pairs from the start site of 
transcription. The promoters often have an element that is responsive to transactivation 
by a DNA-binding moiety such as a polypeptide, e.g., a nuclear receptor, Gal4, the lac 
repressor and the like.

A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that
is active under certain environmental or developmental conditions.

The term "operably linked" refers to a functional linkage between a 
nucleic acid expression control sequence (such as a promoter, or array of transcription
factor binding sites) and a second nucleic acid sequence, wherein the expression control 
sequence directs transcription of the nucleic acid corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated 
recombinantly or synthetically, with a series of specified nucleic acid elements that
permit transcription of a particular nucleic acid in a host cell, and optionally, integration
or replication of the expression vector in a host cell. The expression vector can be part of
a plasmid, virus, or nucleic acid fragment, of viral or non-viral origin. Typically, the
expression vector includes an "expression cassette," which comprises a nucleic acid to be
transcribed operably linked to a promoter. The term expression vector also encompasses
naked DNA operably linked to a promoter.

By "host cell" is meant a cell that contains a zinc finger protein or an 
expression vector or nucleic acid encoding a zinc finger protein. The host cell typically 
supports the replication and/or expression of the expression vector. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, fungal, protozoal,
higher plant, insect, or amphibian cells, or mammalian cells such as CHO, HeLa, 293,
COS-1, and the like, e.g., cultured cells (in vitro), explants and primary cultures (in vitro
and ex vivo), and cells in vivo.

"Nucleic acid" refers to deoxyribonucleotides or ribonucleotides and 
polymers thereof in either single- or double-stranded form. The term encompasses 
nucleic acids containing known nucleotide analogs or modified backbone residues or
linkages, which are synthetic, naturally occurring, and non-naturally occurring, which
have similar binding properties as the reference nucleic acid. Examples of such analogs
include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates,
chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs).

Unless otherwise indicated, a particular nucleic acid sequence also
implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon
substitutions) and complementary sequences, as well as the sequence explicitly indicated.
Specifically, degenerate codon substitutions may be achieved by generating sequences in
which the third position of one or more selected (or all) codons is substituted with mixed-
base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991);
Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 
8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, 
mRNA, oligonucleotide, and polynucleotide. 

The terms "polypeptide," "peptide" and "protein" are used interchangeably 
herein to refer to a polymer of amino acid residues. The terms also apply to amino acid 
polymers in which one or more amino acid residue is an artificial chemical mimetic of a
corresponding naturally occurring amino acid, as well as to naturally occurring amino
acid polymers and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino 
acids, as well as amino acid analogs and amino acid mimetics that function in a manner 
similar to the naturally occurring amino acids. Naturally occurring amino acids are those 
encoded by the genetic code, as well as those amino acids that are later modified, e.g., 
hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to
compounds that have the same basic chemical structure as a naturally occurring amino
acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and
an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl
sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide
backbones, but retain the same basic chemical structure as a naturally occurring amino
acid. Amino acid mimetics refers to chemical compounds that have a structure that is
different from the general chemical structure of an amino acid, but that function in a
manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known 
three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB 
Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by 
their commonly accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic 
acid sequences. With respect to particular nucleic acid sequences, conservatively 
modified variants refers to those nucleic acids which encode identical or essentially 
identical amino acid sequences, or where the nucleic acid does not encode an amino acid 
sequence, to essentially identical sequences. Because of the degeneracy of the genetic 
code, a large number of functionally identical nucleic acids encode any given protein.
For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine.
Thus, at every position where an alanine is specified by a codon, the codon can be altered 
to any of the corresponding codons described without altering the encoded polypeptide. 
Such nucleic acid variations are "silent variations," which are one species of 
conservatively modified variations. Every nucleic acid sequence herein which encodes a 
polypeptide also describes every possible silent variation of the nucleic acid. One of skill 
will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the 
only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) 
can be modified to yield a functionally identical molecule. Accordingly, each silent 
variation of a nucleic acid which encodes a polypeptide is implicit in each described 
sequence. 
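
The notion of silent variations can be illustrated by enumerating synonymous codons. The sketch below assumes the standard genetic code and works on DNA codons (U is converted to T); the names `CODON_TABLE`, `synonymous_codons`, and `count_encodings` are illustrative and do not come from the text.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons (as DNA) encoding the same amino acid as `codon`."""
    aa = CODON_TABLE[codon.upper().replace("U", "T")]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_encodings(coding_seq):
    """Count the nucleic acid sequences, including the original, that encode
    the same polypeptide as `coding_seq`."""
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    count = 1
    for i in range(0, len(seq), 3):
        count *= len(synonymous_codons(seq[i:i + 3]))
    return count
```

With this table, alanine has four synonymous codons while methionine and tryptophan each have one, so a Met-Ala-Trp coding sequence has four encodings in total.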

As to amino acid sequences, one of skill will recognize that individual 
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein 
sequence which alters, adds or deletes a single amino acid or a small percentage of amino 
acids in the encoded sequence is a "conservatively modified variant" where the alteration 
results in the substitution of an amino acid with a chemically similar amino acid. 
Conservative substitution tables providing functionally similar amino acids are well 
known in the art. Such conservatively modified variants are in addition to and do not 
exclude polymorphic variants, interspecies homologs, and alleles of the invention. 

The following eight groups each contain amino acids that are conservative 
substitutions for one another: 

1) Alanine (A), Glycine (G); 

2) Aspartic acid (D), Glutamic acid (E); 

3) Asparagine (N), Glutamine (Q); 

4) Arginine (R), Lysine (K); 

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); 

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 

7) Serine (S), Threonine (T); and 

8) Cysteine (C), Methionine (M) 
(see, e.g., Creighton, Proteins (1984)). 
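
The eight groups above can be represented as sets for a simple membership check. This is a sketch only; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and methionine appears in two groups exactly as listed.

```python
# One set per group listed above, using one-letter amino acid symbols.
CONSERVATIVE_GROUPS = [
    set("AG"), set("DE"), set("NQ"), set("RK"),
    set("ILMV"), set("FYW"), set("ST"), set("CM"),
]


def is_conservative(aa1, aa2):
    """Return True if two different amino acids share at least one group."""
    aa1, aa2 = aa1.upper(), aa2.upper()
    return aa1 != aa2 and any(
        aa1 in group and aa2 in group for group in CONSERVATIVE_GROUPS
    )
```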

Design of zinc finger proteins

The zinc finger proteins of the invention are engineered to recognize a
selected target site in the candidate gene of choice. Typically, a backbone from any
suitable Cys2His2 zinc finger protein, such as SP-1, SP-1C, or ZIF268, is used as the
scaffold for the engineered zinc finger protein (see, e.g., Jacobs, EMBO J. 11:4507
(1992); Desjarlais & Berg, Proc. Natl. Acad. Sci. U.S.A. 90:2256-2260 (1993)). A
number of methods can then be used to design and select a zinc finger protein with high
affinity for its target (e.g., preferably with a Kd of less than about 25 nM). As described
above, a zinc finger protein can be designed or selected to bind to any suitable target site
in the target candidate gene, with high affinity. Co-pending patent application USSN
09/229,007, filed January 12, 1999 (herein incorporated by reference), comprehensively
describes methods for design, construction, and expression of zinc finger proteins for
selected target sites.

Any suitable method known in the art can be used to design and construct
nucleic acids encoding zinc finger proteins, e.g., phage display, random mutagenesis,
combinatorial libraries, computer/rational design, affinity selection, PCR, cloning from
cDNA or genomic libraries, synthetic construction and the like (see, e.g., U.S. Pat. No.
5,786,538; Wu et al., Proc. Natl. Acad. Sci. U.S.A. 92:344-348 (1995); Jamieson et al.,
Biochemistry 33:5689-5695 (1994); Rebar & Pabo, Science 263:671-673 (1994); Choo &
Klug, Proc. Natl. Acad. Sci. U.S.A. 91:11163-11167 (1994); Choo & Klug, Proc. Natl.
Acad. Sci. U.S.A. 91:11168-11172 (1994); Desjarlais & Berg, Proc. Natl. Acad. Sci.
U.S.A. 90:2256-2260 (1993); Desjarlais & Berg, Proc. Natl. Acad. Sci. U.S.A. 89:7345-
7349 (1992); Pomerantz et al., Science 267:93-96 (1995); Pomerantz et al., Proc. Natl.
Acad. Sci. U.S.A. 92:9752-9756 (1995); Liu et al., Proc. Natl. Acad. Sci. U.S.A.
94:5525-5530 (1997); Greisman & Pabo, Science 275:657-661 (1997); and Desjarlais & Berg,
Proc. Natl. Acad. Sci. U.S.A. 91:11099-11103 (1994)).

In a preferred embodiment, copending application USSN 09/229,007, filed 
January 12, 1999 provides methods that select a target gene, and identify a target site 
within the gene containing one to six (or more) D-able sites (see definition below). Using 
these methods, a zinc finger protein can then be synthesized that binds to the preselected 
site. These methods of target site selection are premised, in part, on the recognition that 
the presence of one or more D-able sites in a target segment confers the potential for 
higher binding affinity in a zinc finger protein selected or designed to bind to that site
relative to zinc finger proteins that bind to target segments lacking D-able sites.
Experimental evidence supporting this insight is provided in Examples 2-9 of copending 
application USSN 09/229,007, filed January 12, 1999. 

A D-able site or subsite is a region of a target site that allows an 
appropriately designed single zinc finger to bind to four bases rather than three of the 
target site. Such a zinc finger binds to a triplet of bases on one strand of a double-
stranded target segment (target strand) and a fourth base on the other strand (see Figure 2
of copending application USSN 09/229,007, filed January 12, 1999). Binding of a single
zinc finger to a four base target segment imposes constraints both on the sequence of the
target strand and on the amino acid sequence of the zinc finger. The target site within the
target strand should include the "D-able" site motif 5' NNGK 3', in which N and K are
conventional IUPAC-IUB ambiguity codes. A zinc finger for binding to such a site
should include an arginine residue at position -1 and an aspartic acid (or less preferably a
glutamic acid) at position +2. The arginine residue at position -1 interacts with the G
residue in the D-able site. The aspartic acid (or glutamic acid) residue at position +2 of 
the zinc finger interacts with the opposite strand base complementary to the K base in the 
D-able site. It is the interaction between aspartic acid (symbol D) and the opposite strand 
base (fourth base) that confers the name D-able site. As is apparent from the D-able site 
formula, there are two subtypes of D-able sites: 5' NNGG 3' and 5' NNGT 3'. For the 
former site, the aspartic acid or glutamic acid at position +2 of a zinc finger interacts with 
a C in the opposite strand to the D-able site. In the latter site, the aspartic acid or 
glutamic acid at position +2 of a zinc finger interacts with an A in the opposite strand to 
the D-able site. In general, NNGG is preferred over NNGT. 

In the design of a zinc finger protein with three fingers, a target site should
be selected in which at least one finger of the protein, and optionally, two or all three
fingers have the potential to bind a D-able site. Such can be achieved by selecting a
target site from within a larger target gene having the formula 5'-NNx aNy bNzc-3',
wherein

each of the sets (x, a), (y, b) and (z, c) is either (N, N) or (G, K);
at least one of (x, a), (y, b) and (z, c) is (G, K); and
N and K are IUPAC-IUB ambiguity codes.

In other words, at least one of the three sets (x, a), (y, b) and (z, c) is the
set (G, K), meaning that the first position of the set is G and the second position is G or T.
Those of the three sets (if any) which are not (G, K) are (N, N), meaning that the first
position of the set can be occupied by any nucleotide and the second position of the set
can be occupied by any nucleotide. As an example, the set (x, a) can be (G, K) and the
sets (y, b) and (z, c) can both be (N, N).

In the formula 5'-NNx aNy bNzc-3', the triplets NNx, aNy and bNz
represent the triplets of bases on the target strand bound by the three fingers in a zinc
finger protein. If only one of x, y and z is a G, and this G is followed by a K, the target
site includes a single D-able subsite. For example, if only x is G, and a is K, the site reads
5'-NNG KNy bNzc-3', with the D-able subsite being the first four bases (NNGK). If both x
and y but not z are G, and a and b are K, then the target site has two overlapping D-able
subsites as follows: 5'-NNG KNG KNz c-3', with one such subsite at bases 1-4 (NNGK) and
the other at bases 4-7 (KNGK). If all three of x, y and z are G and a, b, and c are K, then
the target segment includes three D-able subsites, as follows: 5'-NNG KNG KNG K-3', the
D-able subsites being at bases 1-4, 4-7 and 7-10.

These methods thus work by selecting a target gene, and systematically 
searching within the possible subsequences of the gene for target sites conforming to the 
formula 5'-NNx aNy bNzc-3', as described above. In some such methods, every possible 
subsequence of 10 contiguous bases on either strand of a potential target gene is evaluated 
to determine whether it conforms to the above formula, and, if so, how many D-able sites 
are present. Typically, such a comparison is performed by computer, and a list of target 
sites conforming to the formula is output. Optionally, such target sites can be output in
different subsets according to how many D-able sites are present. 
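
The search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the function names (`reverse_complement`, `count_dable`, `find_target_sites`, `group_by_dable`), the returned tuple layout, and the convention of reporting minus-strand positions as offsets into the reverse complement are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 0-based indices of the (x, a), (y, b) and (z, c) pairs in 5'-NNx aNy bNzc-3'
_PAIRS = ((2, 3), (5, 6), (8, 9))

Hit = Tuple[str, int, str, int]  # (strand, start, site, number of D-able subsites)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_dable(site: str) -> int:
    """Count pairs equal to (G, K), i.e. G followed by G or T, in a 10-base site."""
    return sum(1 for i, j in _PAIRS if site[i] == "G" and site[j] in "GT")


def find_target_sites(gene: str) -> List[Hit]:
    """Scan every 10-base window on both strands; keep windows with >= 1 D-able subsite.

    Minus-strand starts are offsets into the reverse complement (assumed convention).
    """
    hits: List[Hit] = []
    plus = gene.upper()
    for strand, seq in (("+", plus), ("-", reverse_complement(plus))):
        for start in range(len(seq) - 9):
            site = seq[start:start + 10]
            n_dable = count_dable(site)
            if n_dable >= 1:
                hits.append((strand, start, site, n_dable))
    return hits


def group_by_dable(hits: List[Hit]) -> Dict[int, List[Hit]]:
    """Split hits into subsets keyed by the number of D-able subsites (1, 2 or 3)."""
    groups: Dict[int, List[Hit]] = {1: [], 2: [], 3: []}
    for hit in hits:
        groups[hit[3]].append(hit)
    return groups
```

Calling `group_by_dable(find_target_sites(sequence))` gives the optional output in subsets according to how many D-able subsites each conforming site contains.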

In a variation, the methods of the invention identify first and second target 
segments, each independently conforming to the above formula. The two target segments 
in such methods are constrained to be adjacent or proximate (i.e., within about 0-5 bases) 
of each other in the target gene. The strategy underlying selection of proximate target 
segments is to allow the design of a zinc finger protein formed by linkage of two 
component zinc finger proteins specific for the first and second target segments 
respectively. These principles can be extended to select target sites to be bound by zinc 
finger proteins with any number of component fingers. For example, a suitable target site 
for a nine finger protein would have three component segments, each conforming to the
above formula.
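
Selecting adjacent or proximate target segments can be illustrated with a short Python sketch. It assumes that each segment is 10 bases long (the length of the formula above), that all start positions lie on the same strand, and that "proximate" means a gap of 0 to 5 bases between the end of the first segment and the start of the second; the function name `proximate_pairs` and its parameters are hypothetical.

```python
from typing import List, Tuple

SITE_LEN = 10  # assumed segment length, matching 5'-NNx aNy bNzc-3'


def proximate_pairs(starts: List[int], max_gap: int = 5) -> List[Tuple[int, int, int]]:
    """Return (first_start, second_start, gap) for same-strand segment pairs.

    gap is the number of bases between the end of the first segment and the
    start of the second; only non-overlapping pairs with 0 <= gap <= max_gap are kept.
    """
    ordered = sorted(starts)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            gap = second - (first + SITE_LEN)
            if gap > max_gap:
                break  # later segments are even further away
            if gap >= 0:
                pairs.append((first, second, gap))
    return pairs
```

The same pairing step could be chained to pick three component segments for a nine finger protein, by requiring the second segment of one pair to be the first segment of another.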

The target sites identified by the above methods can be subject to further
evaluation by other criteria or can be used directly for design or selection (if needed) and
production of a zinc finger protein specific for such a site. A further criterion for
evaluating potential target sites is their proximity to particular regions within a gene. If a
zinc finger protein is to be used to repress a cellular gene on its own (i.e., without linking
the zinc finger protein to a repressing moiety), then the optimal location appears to be at,
or within 50 bp upstream or downstream of the site of transcription initiation, to interfere
with the formation of the transcription complex (Kim & Pabo, J. Biol. Chem. 272:29795-
29800 (1997)) or compete for an essential enhancer binding protein. If, however, a zinc
finger protein is fused to a functional domain such as the KRAB repressor domain or the
VP16 activator domain, the location of the binding site is considerably more flexible and
can be outside known regulatory regions. For example, a KRAB domain can repress
transcription at a promoter up to at least 3 kbp from where KRAB is bound (Margolin et
al., Proc. Natl. Acad. Sci. U.S.A. 91:4509-4513 (1994)). Thus, target sites can be selected
that do not necessarily include or overlap segments of demonstrable biological
significance with target genes, such as regulatory sequences. Other criteria for further
evaluating target segments include the prior availability of zinc finger proteins binding to
such segments or related segments, and/or ease of designing new zinc finger proteins to
bind a given target segment.
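
The location criterion can be expressed as a simple filter. The sketch below is an illustration only: the function name `location_acceptable`, its parameters, and the use of a single distance measured from the transcription initiation site are assumptions. The 3000 bp default is taken from the KRAB example above, where the text says "at least 3 kbp", so it is not a hard limit.

```python
def location_acceptable(
    site_pos: int,
    tss_pos: int,
    fused_to_functional_domain: bool,
    window_alone_bp: int = 50,
    window_fused_bp: int = 3000,
) -> bool:
    """Check whether a target site lies close enough to the transcription start.

    A zinc finger protein acting on its own is held to the 50 bp window; one
    fused to a functional domain (e.g. KRAB) is given the wider, assumed window.
    """
    distance = abs(site_pos - tss_pos)
    limit = window_fused_bp if fused_to_functional_domain else window_alone_bp
    return distance <= limit
```

A filter of this kind would be applied after the formula-based search, so that only conforming sites are tested for position.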

After a target segment has been selected, a zinc finger protein that binds to
the segment can be provided by a variety of approaches. The simplest of approaches is to
provide a precharacterized zinc finger protein from an existing collection that is already
known to bind to the target site. However, in many instances, such zinc finger proteins
do not exist. An alternative approach can also be used to design new zinc finger proteins,
which uses the information in a database of existing zinc finger proteins and their
respective binding affinities. A further approach is to design a zinc finger protein based
on substitution rules as discussed above. A still further alternative is to select a zinc
finger protein with specificity for a given target by an empirical process such as phage
display. In some such methods, each component finger of a zinc finger protein is
designed or selected independently of other component fingers. For example, each finger
can be obtained from a different preexisting zinc finger protein or each finger can be
subject to separate randomization and selection.

Once a zinc finger protein has been selected, designed, or otherwise 
provided to a given target segment, the zinc finger protein or the DNA encoding it are 
synthesized. Exemplary methods for synthesizing and expressing DNA encoding zinc 
finger proteins are described below. The zinc finger protein or a polynucleotide encoding 
it can then be used for modulation of expression, or analysis of the target gene containing 
the target site to which the zinc finger protein binds. 

Expression and purification of zinc finger proteins 

Zinc finger protein polypeptides and nucleic acids can be made using 
routine techniques in the field of recombinant genetics. Basic texts disclosing the general 
methods of use in this invention include Sambrook et al., Molecular Cloning, A
Laboratory Manual (2nd ed. 1989); Kriegler, Gene Transfer and Expression: A
Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al.,
eds., 1994). In addition, essentially any nucleic acid can be custom ordered from any of
a variety of commercial sources. Similarly, peptides and antibodies can be custom 
ordered from any of a variety of commercial sources. 

Two alternative methods are typically used to create the coding sequences 
required to express newly designed DNA-binding peptides. One protocol is a PCR-based 
assembly procedure that utilizes six overlapping oligonucleotides (see Figure 1 of
copending patent application USSN 09/229,037). Three oligonucleotides correspond to 
"universal" sequences that encode portions of the DNA-binding domain between the 
recognition helices. These oligonucleotides remain constant for all zinc finger constructs. 
The other three "specific" oligonucleotides are designed to encode the recognition 
helices. These oligonucleotides contain substitutions primarily at positions -1, 2, 3 and 6 
on the recognition helices making them specific for each of the different DNA-binding 
domains. 

The PCR synthesis is carried out in two steps. First, a double stranded 
DNA template is created by combining the six oligonucleotides (three universal, three 
specific) in a four cycle PCR reaction with a low temperature annealing step, thereby 
annealing the oligonucleotides to form a DNA "scaffold." The gaps in the scaffold are 
filled in by high-fidelity thermostable polymerase; the combination of Taq and Pfu
polymerases also suffices. In the second phase of construction, the zinc finger template is
amplified by external primers designed to incorporate restriction sites at either end for
cloning into a shuttle vector or directly into an expression vector. 

An alternative method of cloning the newly designed DNA-binding 
proteins relies on annealing complementary oligonucleotides encoding the specific 
regions of the desired zinc finger protein. This particular application requires that the 
oligonucleotides be phosphorylated prior to the final ligation step. This is usually 
performed before setting up the annealing reactions, but kinasing can also occur post-
annealing. In brief, the "universal" oligonucleotides encoding the constant regions of the
proteins are annealed with their complementary oligonucleotides. Additionally, the 
"specific" oligonucleotides encoding the finger recognition helices are annealed with their 
respective complementary oligonucleotides. These complementary oligos are designed to 
fill in the region which was previously filled in by polymerase in the protocol described 
above. The complementary oligos to the common oligos 1 and finger 3 are engineered to 
leave overhanging sequences specific for the restriction sites used in cloning into the 
vector of choice. The second assembly protocol differs from the initial protocol in the 
following aspects: the "scaffold" encoding the newly designed zinc finger protein is 
composed entirely of synthetic DNA, thereby eliminating the polymerase fill-in step;
additionally, the fragment to be cloned into the vector does not require amplification.
Lastly, the design of leaving sequence-specific overhangs eliminates the need for 
restriction enzyme digests of the inserting fragment. 

The resulting fragment encoding the newly designed zinc finger protein is 
ligated into an expression vector. Expression vectors that are commonly utilized include,
but are not limited to, a modified pMAL-c2 bacterial expression vector (New England
BioLabs, "NEB") or a eukaryotic expression vector, pcDNA (Promega). 

Any suitable method of protein purification known to those of skill in the 
art can be used to purify zinc finger proteins of the invention (see Ausubel, supra,
Sambrook, supra). In addition, any suitable host can be used, e.g., bacterial cells, insect 
cells, yeast cells, mammalian cells, and the like. 

In one embodiment, expression of the zinc finger protein fused to a 
maltose binding protein (MBP-ZFP) in bacterial strain JM109 allows for straightforward 
purification through an amylose column (NEB). High expression levels of the zinc finger 
chimeric protein can be obtained by induction with IPTG since the MBP-ZFP fusion in
the pMal-c2 expression plasmid is under the control of the IPTG inducible tac promoter
(NEB). Bacteria containing the MBP-ZFP fusion plasmids are inoculated into 2x YT
medium containing 10 µM ZnCl2, 0.02% glucose, plus 50 µg/ml ampicillin and shaken at
37°C. At mid-exponential growth IPTG is added to 0.3 mM and the cultures are allowed
to shake. After 3 hours the bacteria are harvested by centrifugation, disrupted by
sonication, and then insoluble material is removed by centrifugation. The MBP-ZFP
proteins are captured on an amylose-bound resin, washed extensively with buffer
containing 20 mM Tris-HCl (pH 7.5), 200 mM NaCl, 5 mM DTT and 50 µM ZnCl2, then
eluted with maltose in essentially the same buffer (purification is based on a standard
protocol from NEB). Purified proteins are quantitated and stored for biochemical
analysis.

The biochemical properties of the purified proteins, e.g., Kd, can be
characterized by any suitable assay. In one embodiment, Kd is characterized via
electrophoretic mobility shift assays ("EMSA") (Buratowski & Chodosh, in Current
Protocols in Molecular Biology pp. 12.2.1-12.2.7 (Ausubel ed., 1996); see also U.S.
Patent No. 5,789,538, USSN 09/229,007, filed January 12, 1999, herein incorporated by
reference). Affinity is measured by titrating purified protein against a low fixed amount 
of labeled double-stranded oligonucleotide target. The target comprises the natural 
binding site sequence (9 or 18 bp) flanked by the 3 bp found in the natural sequence.
External to the binding site plus flanking sequence is a constant sequence. The annealed 
oligonucleotide targets possess a 1 bp 5' overhang which allows for efficient labeling of 
the target with T4 phage polynucleotide kinase. For the assay the target is added at a 
concentration of 40 nM or lower (the actual concentration is kept at least 10-fold lower 
than the lowest protein dilution) and the reaction is allowed to equilibrate for at least 45 
min. In addition, the reaction mixture also contains 10 mM Tris (pH 7.5), 100 mM KCl, 1
mM MgCl2, 0.1 mM ZnCl2, 5 mM DTT, 10% glycerol, 0.02% BSA (poly(dIdC) or
(dAdT) (Pharmacia) can also be added at 10-100 µg/µl).

The equilibrated reactions are loaded onto a 10% polyacrylamide gel, 
which has been pre-run for 45 min in Tris/glycine buffer. Bound and unbound labeled 
target is resolved with electrophoresis at 150 V (alternatively, 10-20% gradient Tris-HCl
gels, containing a 4% polyacrylamide stacker, can be used). The dried gels are visualized
by autoradiography or phosphorimaging and the apparent Kd is determined by
calculating the protein concentration that gives half-maximal binding. 
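
One way to compute the half-maximal point from a titration is sketched below. It is an assumption-laden illustration rather than the procedure actually used: the function name `apparent_kd` is hypothetical, the bound fractions are assumed to be already normalized so that maximal binding equals 1.0, and the half-maximal concentration is found by linear interpolation in log10(concentration) between the two bracketing titration points instead of by curve fitting.

```python
import math
from typing import Sequence


def apparent_kd(protein_conc: Sequence[float], fraction_bound: Sequence[float]) -> float:
    """Estimate the protein concentration that gives half-maximal binding.

    protein_conc and fraction_bound are parallel sequences from one titration;
    concentrations must be positive and share a single unit (e.g. nM).
    """
    points = sorted(zip(protein_conc, fraction_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        # Look for the pair of neighbouring points that brackets 50% bound.
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_kd = math.log10(c_lo) + t * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_kd
    raise ValueError("titration does not bracket half-maximal binding")
```

The result is in the same unit as the input concentrations. A titration that never crosses 50% bound raises an error rather than extrapolating.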

Similar assays can also include determining active fractions in the protein
preparations. Active fractions are determined by stoichiometric gel shifts where proteins 
are titrated against a high concentration of target DNA. Titrations are done at 100, 50, 
and 25% of target (usually at micromolar levels). 

In another embodiment, phage display libraries can be used to select zinc
finger proteins with high affinity to the selected target site. This method differs
fundamentally from direct design in that it involves the generation of diverse libraries of
mutagenized zinc finger proteins, followed by the isolation of proteins with desired DNA-
binding properties using affinity selection methods. To use this method, the experimenter
typically proceeds as follows.

First, a gene for a zinc finger protein is mutagenized to introduce diversity 
into regions important for binding specificity and/or affinity. In a typical application, this
is accomplished via randomization of a single finger at positions -1, +2, +3, and +6, and
perhaps accessory positions such as +1, +5, +8, or +10.

Next, the mutagenized gene is cloned into a phage or phagemid vector as a
fusion with, e.g., gene III of filamentous phage, which encodes the coat protein pIII. The
zinc finger gene is inserted between segments of gene III encoding the membrane export
signal peptide and the remainder of pIII, so that the zinc finger protein is expressed as an
amino-terminal fusion with pIII in the mature, processed protein. When using phagemid
vectors, the mutagenized zinc finger gene may also be fused to a truncated version of
gene III encoding, minimally, the C-terminal region required for assembly of pIII into the
phage particle. 

The resultant vector library is transformed into E. coli and used to produce
filamentous phage which express variant zinc finger proteins on their surface as fusions
with the coat protein pIII (if a phagemid vector is used, then this step requires
superinfection with helper phage). The phage library is then incubated with target DNA
site and affinity selection methods are used to isolate phage which bind target with high
affinity from bulk phage. Typically, the DNA target is immobilized on a solid support,
which is then washed under conditions sufficient to remove all but the tightest binding
phage. After washing, any phage remaining on the support are recovered via elution
under conditions which totally disrupt zinc finger-DNA binding. 

Recovered phage are used to infect fresh E. coli, which is then amplified 
and used to produce a new batch of phage particles. The binding and recovery steps are 
then repeated as many times as is necessary to sufficiently enrich the phage pool for tight
binders such that these may be identified using sequencing and/or screening methods. 

Regulatory domains 

The zinc finger proteins of the invention can optionally be associated with 
regulatory domains for modulation of gene expression. The zinc finger protein can be 
covalently or non-covalently associated with one or more regulatory domains, 
alternatively two or more regulatory domains, with the two or more domains being two 
copies of the same domain, or two different domains. The regulatory domains can be 
covalently linked to the zinc finger protein, e.g., via an amino acid linker, as part of a 
fusion protein. The zinc finger proteins can also be associated with a regulatory domain 
via a non-covalent dimerization domain, e.g., a leucine zipper, a STAT protein N terminal 
domain, or an FK506 binding protein (see, e.g., O'Shea, Science 254: 539 (1991), 
Barahmand-Pour et al., Curr. Top. Microbiol. Immunol. 211:121-128 (1996); Klemm et
al., Annu. Rev. Immunol. 16:569-592 (1998); Ho et al., Nature 382:822-826 (1996); and
Pomerantz et al., Biochem. 37:965
(1998)). The regulatory domain can be associated with the zinc finger protein at any 
suitable position, including the C- or N-terminus of the zinc finger protein. 

Common regulatory domains for addition to the zinc finger protein 
include, e.g., effector domains from transcription factors (activators, repressors, co- 
activators, co-repressors), silencers, nuclear hormone receptors, oncogene transcription 
factors (e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos family members etc.); 
DNA repair enzymes and their associated factors and modifiers; DNA rearrangement 
enzymes and their associated factors and modifiers; chromatin associated proteins and 
their modifiers (e.g., kinases, acetylases and deacetylases); and DNA modifying enzymes 
(e.g., methyltransferases, topoisomerases, helicases, ligases, kinases, phosphatases, 
polymerases, endonucleases) and their associated factors and modifiers. 

Transcription factor polypeptides from which one can obtain a regulatory 
domain include those that are involved in regulated and basal transcription. Such 
polypeptides include transcription factors, their effector domains, coactivators, silencers, 
nuclear hormone receptors (see, e.g., Goodrich et al., Cell 84:825-30 (1996) for a review 
of proteins and nucleic acid elements involved in transcription; transcription factors in 
general are reviewed in Barnes & Adcock, Clin. Exp. Allergy 25 Suppl. 2:46-9 (1995) and 
Roeder, Methods Enzymol. 273:165-71 (1996)). Databases dedicated to transcription
factors are known (see, e.g., Science 269:630 (1995)). Nuclear hormone receptor
transcription factors are described in, for example, Rosen et al., J. Med. Chem. 38:4855-
74 (1995). The C/EBP family of transcription factors are reviewed in Wedel et al.,
Immunobiology 193:171-85 (1995). Coactivators and co-repressors that mediate
transcription regulation by nuclear hormone receptors are reviewed in, for example,
Meier, Eur. J. Endocrinol. 134(2):158-9 (1996); Kaiser et al., Trends Biochem. Sci.
21:342-5 (1996); and Utley et al., Nature 394:498-502 (1998). GATA transcription
factors, which are involved in regulation of hemopoiesis, are described in, for example,
Simon, Nat. Genet. 11:9-11 (1995); Weiss et al., Exp. Hematol. 23:99-107. TATA box
binding protein (TBP) and its associated TAF polypeptides (which include TAF30,
TAF55, TAF80, TAF110, TAF150, and TAF250) are described in Goodrich & Tjian,
Curr. Opin. Cell Biol. 6:403-9 (1994) and Hurley, Curr. Opin. Struct. Biol. 6:69-75
(1996). The STAT family of transcription factors are reviewed in, for example,
Barahmand-Pour et al., Curr. Top. Microbiol. Immunol. 211:121-8 (1996). Transcription
factors involved in disease are reviewed in Aso et al., J. Clin. Invest. 97:1561-9 (1996).

In one embodiment, the KRAB repression domain from the human KOX-1 
protein is used as a transcriptional repressor (Thiesen et al., New Biologist 2:363-374
(1990); Margolin et al., Proc. Natl. Acad. Sci. U.S.A. 91:4509-4513 (1994); Pengue et al.,
Nucl. Acids Res. 22:2908-2914 (1994); Witzgall et al., Proc. Natl. Acad. Sci. U.S.A.
91:4514-4518 (1994)). In another embodiment, KAP-1, a KRAB co-repressor, is used
with KRAB (Friedman et al., Genes Dev. 10:2067-2078 (1996)). Alternatively, KAP-1
can be used alone with a zinc finger protein. Other preferred transcription factors and
transcription factor domains that act as transcriptional repressors include MAD (see, e.g.,
Sommer et al., J. Biol. Chem. 273:6632-6642 (1998); Gupta et al., Oncogene 16:1149-
1159 (1998); Queva et al., Oncogene 16:967-977 (1998); Larsson et al., Oncogene
15:737-748 (1997); Laherty et al., Cell 89:349-356 (1997); and Cultraro et al., Mol. Cell.
Biol. 17:2353-2359 (1997)); FKHR (forkhead in rhabdomyosarcoma gene; Ginsberg et al.,
Cancer Res. 15:3542-3546 (1998); Epstein et al., Mol. Cell. Biol. 18:4118-4130 (1998));
EGR-1 (early growth response gene product-1; Yan et al., Proc. Natl. Acad. Sci. U.S.A.
95:8298-8303 (1998); and Liu et al., Cancer Gene Ther. 5:3-28 (1998)); the ets2
repressor factor repressor domain (ERD; Sgouras et al., EMBO J. 14:4781-4793 (1995));
and the MAD smSIN3 interaction domain (SID; Ayer et al., Mol. Cell. Biol. 16:5772-
5781 (1996)).

In one embodiment, the HSV VP16 activation domain is used as a
transcriptional activator (see, e.g., Hagmann et al., J. Virol. 71:5952-5962 (1997)). Other
preferred transcription factors that could supply activation domains include the VP64
activation domain (Seipel et al., EMBO J. 11:4961-4968 (1996)); nuclear hormone
receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol. 10:373-383 (1998)); the p65
subunit of nuclear factor kappa B (Bitko & Barik, J. Virol. 72:5610-5618 (1998) and
Doyle & Hunt, Neuroreport 8:2937-2942 (1997)); and EGR-1 (early growth response
gene product-1; Yan et al., Proc. Natl. Acad. Sci. U.S.A. 95:8298-8303 (1998); and Liu et
al., Cancer Gene Ther. 5:3-28 (1998)).

Kinases, phosphatases, and other proteins that modify polypeptides 
involved in gene regulation are also useful as regulatory domains for zinc finger proteins.
Such modifiers are often involved in switching on or off transcription mediated by, for
example, hormones. Kinases involved in transcription regulation are reviewed in Davis,
Mol. Reprod. Dev. 42:459-67 (1995), Jackson et al., Adv. Second Messenger
Phosphoprotein Res. 28:279-86 (1993), and Boulikas, Crit. Rev. Eukaryot. Gene Expr.
5:1-77 (1995), while phosphatases are reviewed in, for example, Schonthal & Semin,
Cancer Biol. 6:239-48 (1995). Nuclear tyrosine kinases are described in Wang, Trends
Biochem. Sci. 19:373-6 (1994).

As described, useful domains can also be obtained from the gene products
of oncogenes (e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos family
members) and their associated factors and modifiers. Oncogenes are described in, for
example, Cooper, Oncogenes, The Jones and Bartlett Series in Biology (2nd ed., 1995).
The ets transcription factors are reviewed in Wasylyk et al., Eur. J. Biochem. 211:7-18
(1993) and Crepieux et al., Crit. Rev. Oncog. 5:615-38 (1994). Myc oncogenes are
reviewed in, for example, Ryan et al., Biochem. J. 314:713-21 (1996). The jun and fos
transcription factors are described in, for example, The Fos and Jun Families of
Transcription Factors (Angel & Herrlich, eds. 1994). The max oncogene is reviewed in
Hurlin et al., Cold Spring Harb. Symp. Quant. Biol. 59:109-16. The myb gene family is
reviewed in Kanei-Ishii et al., Curr. Top. Microbiol. Immunol. 211:89-98 (1996). The
mos family is reviewed in Yew et al., Curr. Opin. Genet. Dev. 3:19-25 (1993).

Zinc finger proteins can include regulatory domains obtained from DNA
repair enzymes and their associated factors and modifiers. DNA repair systems are
reviewed in, for example, Vos, Curr. Opin. Cell Biol. 4:385-95 (1992); Sancar, Ann. Rev.
Genet. 29:69-105 (1995); Lehmann, Genet. Eng. 17:1-19 (1995); and Wood, Ann. Rev.
Biochem. 65:135-67 (1996). DNA rearrangement enzymes and their associated factors
and modifiers can also be used as regulatory domains (see, e.g., Gangloff et al.,
Experientia 50:261-9 (1994); Sadowski, FASEB J. 7:760-7 (1993)).

Similarly, regulatory domains can be derived from DNA modifying
enzymes (e.g., DNA methyltransferases, topoisomerases, helicases, ligases, kinases,
phosphatases, polymerases) and their associated factors and modifiers. Helicases are
reviewed in Matson et al., Bioessays 16:13-22 (1994), and methyltransferases are
described in Cheng, Curr. Opin. Struct. Biol. 5:4-10 (1995). Chromatin associated
proteins and their modifiers (e.g., kinases, acetylases and deacetylases), such as histone
deacetylase (Wolffe, Science 272:371-2 (1996)) are also useful as domains for addition to
the zinc finger protein of choice. In one preferred embodiment, the regulatory domain is
a DNA methyl transferase that acts as a transcriptional repressor (see, e.g., Van den
Wyngaert et al., FEBS Lett. 426:283-289 (1998); Flynn et al., J. Mol. Biol. 279:101-116
(1998); Okano et al., Nucleic Acids Res. 26:2536-2540 (1998); and Zardo & Caiafa, J.
Biol. Chem. 273:16517-16520 (1998)). In another preferred embodiment, endonucleases
such as Fok1 are used as transcriptional repressors, which act via gene cleavage (see, e.g.,
WO95/09233; and PCT/US94/01201).

Factors that control chromatin and DNA structure, movement and
localization and their associated factors and modifiers; factors derived from microbes
(e.g., prokaryotes, eukaryotes and virus) and factors that associate with or modify them
can also be used to obtain chimeric proteins. In one embodiment, recombinases and
integrases are used as regulatory domains. In one embodiment, histone acetyltransferase
is used as a transcriptional activator (see, e.g., Jin & Scotto, Mol. Cell. Biol. 18:4377-
4384 (1998); Wolffe, Science 272:371-372 (1996); Taunton et al., Science 272:408-411
(1996); and Hassig et al., Proc. Natl. Acad. Sci. U.S.A. 95:3519-3524 (1998)). In another
embodiment, histone deacetylase is used as a transcriptional repressor (see, e.g., Jin &
Scotto, Mol. Cell. Biol. 18:4377-4384 (1998); Syntichaki & Thireos, J. Biol. Chem.
273:24414-24419 (1998); Sakaguchi et al., Genes Dev. 12:2831-2841 (1998); and
Martinez et al., J. Biol. Chem. 273:23781-23785 (1998)).

Linker domains between polypeptide domains, e.g., between two zinc
finger proteins or between a zinc finger protein and a regulatory domain, can be included.
[...] 5525-5530 [...] are used to link two zinc finger proteins. In another
embodiment, the following linkers are used to link two zinc finger proteins: GGRR
(Pomerantz et al. 1995, supra), (G4S)n (Kim et al. [...]); GGRRGGGS; LRQRDGERP;
[...]. [...] modeling both DNA-binding sites and the [...] Proc. Natl. Acad. Sci. U.S.A.
91:11099-11103 (1994)) or by phage display methods.

In other embodiments, a chemical linker is used to connect synthetically or
recombinantly produced domain sequences. Such flexible linkers are known to persons
of skill in the art. For example, poly(ethylene glycol) linkers are available from
Shearwater Polymers, Inc., Huntsville, Alabama. [...] have amide [...]

In addition to regulatory domains, often the zinc finger protein is
expressed as a fusion protein such as maltose binding protein ("MBP"), glutathione S
transferase (GST), hexahistidine, c-myc, and the FLAG epitope, for ease of purification,
monitoring expression, or monitoring cellular and subcellular localization.

Expression vectors for nucleic acids encoding zinc finger protein

[...] e.g., plasmids, or shuttle vectors, or eukaryotic vectors such as insect vectors, for
storage or manipulation of the nucleic acid encoding zinc finger protein or production of
protein, or eukaryotic vectors such as viral vectors (e.g., adenoviral vectors, [...] vector,
etc.) for expression of zinc finger proteins and optionally regulation of gene expression.
The nucleic acid encoding a zinc finger protein can then be administered to a plant cell,
animal cell, a mammalian cell or a human cell, a fungal cell, a bacterial cell, or
protozoal cell.

[...] is typically subcloned into an expression vector that contains a promoter to direct
[...] described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed.
1989); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and
Current Protocols in Molecular Biology (Ausubel et al., eds., 1994). Bacterial expression
systems for expressing the zinc finger protein are available in, e.g., E. coli, Bacillus sp.,
and [...] (Palva et al., Gene 22:229-235 (1983)). Kits for such expression systems
are commercially available. Eukaryotic expression systems for mammalian cells, yeast,
and insect cells are well known in the art and are also commercially available.

The promoter used to direct expression of a zinc finger protein nucleic acid
depends on the particular application. For example, a strong constitutive promoter is
typically used for expression and purification of zinc finger protein. In contrast, when a
zinc finger protein is administered in vivo for gene regulation, either a constitutive or an
inducible promoter is used, depending on the particular use of the zinc finger protein.
[...] e.g., hypoxia response elements, Gal4 response elements, lac [...], and
small molecule control systems such as tet-regulated systems and the RU-486 system
(see, e.g., Gossen & Bujard, Proc. Natl. Acad. Sci. U.S.A. 89:5547 (1992); Oligino et al.,
Gene Ther. 5:491-496 (1998); Wang et al., Gene Ther. 4:432-441 (1997); Neering et al.,
Blood 88:1147-1155 (1996); and Rendahl et al., Nat. Biotechnol. [...]).

In addition to the promoter, the expression vector typically contains
[...] or expression cassette that contains all the additional elements required
[...] sequence encoding the zinc finger protein, and signals required, e.g., for efficient
polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or
translation termination. Additional elements of the cassette may include, e.g., enhancers,
and heterologous spliced intronic signals.

The particular expression vector used to transport the genetic information
into the cell is selected with regard to the intended use of the zinc finger protein, e.g.,
expression in plants, animals, bacteria, fungus, protozoa etc. (see expression vectors
described below and in the Example section). Standard bacterial expression vectors
include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially
available fusion expression systems such as GST and LacZ. A preferred fusion protein is
the maltose binding protein, "MBP." Such fusion proteins are used for purification of
zinc finger protein. Epitope tags can also be added to recombinant proteins to provide
convenient methods of isolation, for monitoring expression, and for monitoring cellular
and subcellular localization, e.g., c-myc or FLAG.

Expression vectors containing regulatory elements from eukaryotic viruses
are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus
vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic
vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE,
and any other vector allowing expression of proteins under the direction of the SV40
early promoter, SV40 late promoter, CMV promoter, metallothionein promoter, murine
mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or
other promoters shown effective for expression in eukaryotic cells.

Some expression systems have markers for selection of stably transfected
cell lines such as neomycin, thymidine kinase, hygromycin B phosphotransferase, and
dihydrofolate reductase. High yield expression systems are also suitable, such as using a
baculovirus vector in insect cells, with a zinc finger protein encoding sequence under the
direction of the polyhedrin promoter or other strong baculovirus promoters.

The elements that are typically included in expression vectors also include
a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit
selection of bacteria that harbor recombinant plasmids, and unique restriction sites in
nonessential regions of the plasmid to allow insertion of recombinant sequences.

Standard transfection methods are used to produce bacterial, mammalian,
yeast or insect cell lines that express large quantities of protein, which are then purified
using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264:17619-17622
(1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher,
ed., 1990)). Transformation of eukaryotic and prokaryotic cells are performed according
to standard techniques (see, e.g., Morrison, J. Bact. 132:349-351 (1977); Clark-Curtiss &
Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)).

Any of the well known procedures for introducing foreign nucleotide 
sequences into host cells may be used. These include the use of calcium phosphate 
transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection,
naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the
other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA
or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is
only necessary that the particular genetic engineering procedure used be capable of
successfully introducing at least one gene into the host cell capable of expressing the
protein of choice.

Vectors encoding zinc finger proteins for regulation of gene expression

Conventional viral and non-viral based gene transfer methods can be used 
to introduce nucleic acids encoding engineered zinc finger protein in mammalian cells or
target tissues. Such methods can be used to administer nucleic acids encoding zinc finger
proteins to cells in vitro or in vivo. Non-viral vector delivery systems include DNA
plasmids, naked nucleic acid, and nucleic acid complexed with a delivery vehicle such as
a liposome. Viral vector delivery systems include DNA and RNA viruses, which have
either episomal or integrated genomes after delivery to the cell. For a review of gene
therapy procedures, see Anderson, Science 256:808-813 (1992); Nabel & Felgner,
TIBTECH 11:211-217 (1993); Mitani & Caskey, TIBTECH 11:162-166 (1993); Dillon,
TIBTECH 11:167-175 (1993); Miller, Nature 357:455-460 (1992); Van Brunt,
Biotechnology 6(10):1149-1154 (1988); Vigne, Restorative Neurology and Neuroscience
8:35-36 (1995); Kremer & Perricaudet, British Medical Bulletin 51(1):31-44 (1995);
Haddada et al., in Current Topics in Microbiology and Immunology, Doerfler and Böhm
(eds) (1995); and Yu et al., Gene Therapy 1:13-26 (1994).

Methods of non-viral delivery of nucleic acids encoding engineered zinc
finger proteins include lipofection, microinjection, biolistics, virosomes, liposomes,
immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial
virions, and agent-enhanced uptake of DNA. Lipofection is described in, e.g., US
5,049,386; US 4,946,787; and US 4,897,355, and lipofection reagents are sold
commercially (e.g., Transfectam™ and Lipofectin™). Cationic and neutral lipids that are
suitable for efficient receptor-recognition lipofection of polynucleotides include those of
Felgner, WO 91/17424, WO 91/16024. Delivery can be to cells (ex vivo administration)
or target tissues (in vivo administration).

The preparation of lipid:nucleic acid complexes, including targeted
liposomes such as immunolipid complexes, is well known to one of skill in the art (see,
e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther. 2:291-297
(1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et al., Bioconjugate
Chem. 5:647-654 (1994); Gao et al., Gene Therapy 2:710-722 (1995); Ahmad et al.,
Cancer Res. 52:4817-4820 (1992); U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871,
4,261,975, 4,485,054, 4,501,728, 4,774,085, 4,837,028, and 4,946,787).

The use of RNA or DNA viral based systems for the delivery of nucleic
acids encoding engineered zinc finger protein takes advantage of highly evolved processes
for targeting a virus to specific cells in the body and trafficking the viral payload to the
nucleus. Viral vectors can be administered directly to subjects (in vivo) or they can be
used to treat cells in vitro and the modified cells are administered to patients (ex vivo).
Conventional viral based systems for the delivery of zinc finger proteins could include
retroviral, lentivirus, adenoviral, adeno-associated and herpes simplex virus vectors for
gene transfer. Viral vectors are currently the most efficient and versatile method of gene
transfer in target cells and tissues. Integration in the host genome is possible with the
retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in
long term expression of the inserted transgene. Additionally, high transduction
efficiencies have been observed in many different cell types and target tissues.

The tropism of a retrovirus can be altered by incorporating foreign
envelope proteins, expanding the potential target population of target cells. Lentiviral
vectors are retroviral vectors that are able to transduce or infect non-dividing cells and
typically produce high viral titers. Selection of a retroviral gene transfer system would
therefore depend on the target tissue. Retroviral vectors are comprised of cis-acting long
terminal repeats with packaging capacity for up to 6-10 kb of foreign sequence. The
minimum cis-acting LTRs are sufficient for replication and packaging of the vectors,
which are then used to integrate the therapeutic gene into the target cell to provide
permanent transgene expression. Widely used retroviral vectors include those based upon
murine leukemia virus (MuLV), gibbon ape leukemia virus (GaLV), simian
immunodeficiency virus (SIV), human immunodeficiency virus (HIV), and combinations
thereof (see, e.g., Buchscher et al., J. Virol. 66:2731-2739 (1992); Johann et al., J. Virol.
66:1635-1640 (1992); Sommerfelt et al., Virol. 176:58-59 (1990); Wilson et al., J. Virol.
63:2374-2378 (1989); Miller et al., J. Virol. 65:2220-2224 (1991); PCT/US94/05700).

In applications where transient expression of the zinc finger protein is 
preferred, adenoviral based systems are typically used. Adenoviral based vectors are 
capable of very high transduction efficiency in many cell types and do not require cell 
division. With such vectors, high titer and levels of expression have been obtained. This
vector can be produced in large quantities in a relatively simple system. Adeno-associated
virus ("AAV") vectors are also used to transduce cells with target nucleic
acids, e.g., in the in vitro production of nucleic acids and peptides, and for in vivo and ex
vivo gene therapy procedures (see, e.g., West et al., Virology 160:38-47 (1987); U.S.
Patent No. 4,797,368; WO 93/24641; Kotin, Human Gene Therapy 5:793-801 (1994);
Muzyczka, J. Clin. Invest. 94:1351 (1994)). Construction of recombinant AAV vectors
is described in a number of publications, including U.S. Pat. No. 5,173,414; Tratschin et
al., Mol. Cell. Biol. 5:3251-3260 (1985); Tratschin et al., Mol. Cell. Biol. 4:2072-2081
(1984); Hermonat & Muzyczka, Proc. Natl. Acad. Sci. U.S.A. 81:6466-6470 (1984); and
Samulski et al., J. Virol. 63:03822-3828 (1989).

Packaging cells are used to form virus particles that are capable of 
infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2
cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are
usually generated by a producer cell line that packages a nucleic acid vector into a viral
particle. The vectors typically contain the minimal viral sequences required for
packaging and subsequent integration into a host, other viral sequences being replaced by
an expression cassette for the protein to be expressed. The missing viral functions are
supplied in trans by the packaging cell line. For example, AAV vectors used in gene
therapy typically only possess ITR sequences from the AAV genome which are required
for packaging and integration into the host genome. Viral DNA is packaged in a cell line,
which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but
lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The
helper virus promotes replication of the AAV vector and expression of AAV genes from
the helper plasmid. The helper plasmid is not packaged in significant amounts due to a
lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat
treatment to which adenovirus is more sensitive than AAV.

In many situations, it is desirable that the vector be delivered with a high
degree of specificity to a particular tissue type. A viral vector is typically modified to
have specificity for a given cell type by expressing a ligand as a fusion protein with a
viral coat protein on the virus's outer surface. The ligand is chosen to have affinity for a
receptor known to be present on the cell type of interest. For example, Han et al., Proc.
Natl. Acad. Sci. U.S.A. 92:9747-9751 (1995), reported that Moloney murine leukemia
virus can be modified to express human heregulin fused to gp70, and the recombinant
virus infects certain human breast cancer cells expressing human epidermal growth factor
receptor. This principle can be extended to other pairs of virus expressing a ligand fusion
protein and target cell expressing a receptor. For example, filamentous phage can be
engineered to display antibody fragments (e.g., FAB or Fv) having specific binding
affinity for virtually any chosen cellular receptor. Although the above description applies
primarily to viral vectors, the same principles can be applied to nonviral vectors. Such
vectors can be engineered to contain specific uptake sequences thought to favor uptake by
specific target cells.

Expression vectors can be delivered in vivo by administration to an
individual subject, typically by systemic administration (e.g., intravenous, intraperitoneal,
intramuscular, subdermal, or intracranial infusion) or topical application, as described
below. Alternatively, naked DNA can be administered. Alternatively, vectors can be
delivered to cells ex vivo, such as cells explanted from an individual subject (e.g.,
lymphocytes, bone marrow aspirates, tissue biopsy) or universal donor hematopoietic
stem cells, followed by reimplantation of the cells into a patient, usually after selection
for cells which have incorporated the vector.

Administration is by any of the routes normally used for introducing a
molecule into ultimate contact with blood or tissue cells. Suitable methods of
administering such nucleic acids are available and well known to those of skill in the art,
and, although more than one route can be used to administer a particular composition, a
particular route can often provide a more immediate and more effective reaction than
another route.

Pharmaceutically acceptable carriers are determined in part by the
particular composition being administered, as well as by the particular method used to
administer the composition. Accordingly, there is a wide variety of suitable formulations
of pharmaceutical compositions of the present invention, as described below (see, e.g.,
Remington's Pharmaceutical Sciences, 17th ed., 1985).

Delivery vehicles for zinc finger proteins 

An important factor in the administration of polypeptide compounds, such 
as the zinc finger proteins, is ensuring that the polypeptide has the ability to traverse the
plasma membrane of a cell, or the membrane of an intra-cellular compartment such as the
nucleus. Cellular membranes are composed of lipid-protein bilayers that are freely
permeable to small, nonionic lipophilic compounds and are inherently impermeable to
polar compounds, macromolecules, and therapeutic or diagnostic agents. However,
proteins and other compounds such as liposomes have been described, which have the
ability to translocate polypeptides such as zinc finger proteins across a cell membrane.

For example, "membrane translocation polypeptides" have amphiphilic or
hydrophobic amino acid subsequences that have the ability to act as membrane-
translocating carriers. In one embodiment, homeodomain proteins have the ability to
translocate across cell membranes. The shortest internalizable peptide of a homeodomain
protein, Antennapedia, was found to be the third helix of the protein, from amino acid
position 43 to 58 (see, e.g., Prochiantz, Current Opinion in Neurobiology 6:629-634
(1996)). Another subsequence, the h (hydrophobic) domain of signal peptides, was found
to have similar cell membrane translocation characteristics (see, e.g., Lin et al., J. Biol.
Chem. 270:14255-14258 (1995)).

Examples of peptide sequences which can be linked to a zinc finger
protein of the invention, for facilitating uptake of zinc finger protein into cells, include,
but are not limited to: an 11 amino acid peptide of the tat protein of HIV; a 20 residue
peptide sequence which corresponds to amino acids 84-103 of the p16 protein (see
Fahraeus et al., Current Biology 6:84 (1996)); the third helix of the 60-amino acid long
homeodomain of Antennapedia (Derossi et al., J. Biol. Chem. 269:10444 (1994)); the h
region of a signal peptide such as the Kaposi fibroblast growth factor (K-FGF) h region
(Lin et al., supra); or the VP22 translocation domain from HSV (Elliot & O'Hare, Cell
88:223-233 (1997)). Other suitable chemical moieties that provide enhanced cellular
uptake may also be chemically linked to zinc finger proteins.

Toxin molecules also have the ability to transport polypeptides across cell
membranes. Often, such molecules are composed of at least two parts (called "binary
toxins"): a translocation or binding domain or polypeptide and a separate toxin domain or
polypeptide. Typically, the translocation domain or polypeptide binds to a cellular
receptor, and then the toxin is transported into the cell. Several bacterial toxins, including
Clostridium perfringens iota toxin, diphtheria toxin (DT), Pseudomonas exotoxin A (PE),
pertussis toxin (PT), Bacillus anthracis toxin, and pertussis adenylate cyclase (CYA),
have been used in attempts to deliver peptides to the cell cytosol as internal or amino-
terminal fusions (Arora et al., J. Biol. Chem. 268:3334-3341 (1993); Perelle et al., Infect.
Immun. 61:5147-5156 (1993); Stenmark et al., J. Cell Biol. 113:1025-1032 (1991);
Donnelly et al., Proc. Natl. Acad. Sci. U.S.A. 90:3530-3534 (1993); Carbonetti et al.,
Abstr. Annu. Meet. Am. Soc. Microbiol. 95:295 (1995); Sebo et al., Infect. Immun.
63:3851-3857 (1995); Klimpel et al., Proc. Natl. Acad. Sci. U.S.A. 89:10277-10281
(1992); and Novak et al., J. Biol. Chem. 267:17186-17193 (1992)).

Such subsequences can be used to translocate zinc finger proteins across a
cell membrane. Zinc finger proteins can be conveniently fused to or derivatized with such
sequences. Typically, the translocation sequence is provided as part of a fusion protein.
Optionally, a linker can be used to link the zinc finger protein and the translocation
sequence. Any suitable linker can be used, e.g., a peptide linker.

The zinc finger protein can also be introduced into an animal cell,
preferably a mammalian cell, via liposomes and liposome derivatives such as
immunoliposomes. The term "liposome" refers to vesicles comprised of one or more
concentrically ordered lipid bilayers, which encapsulate an aqueous phase. The aqueous
phase typically contains the compound to be delivered to the cell, i.e., a zinc finger
protein.

The liposome fuses with the plasma membrane, thereby releasing the drug
into the cytosol. Alternatively, the liposome is phagocytosed or taken up by the cell in a
transport vesicle. Once in the endosome or phagosome, the liposome either degrades or
fuses with the membrane of the transport vesicle and releases its contents.

In current methods of drug delivery via liposomes, the liposome ultimately
becomes permeable and releases the encapsulated compound (in this case, a zinc finger
protein) at the target tissue or cell. For systemic or tissue specific delivery, this can be
accomplished, for example, in a passive manner wherein the liposome bilayer degrades
over time through the action of various agents in the body. Alternatively, active drug
release involves using an agent to induce a permeability change in the liposome vesicle.
Liposome membranes can be constructed so that they become destabilized when the
environment becomes acidic near the liposome membrane (see, e.g., Proc. Natl. Acad.
Sci. U.S.A. 84:7851 (1987); Biochemistry 28:908 (1989)). When liposomes are
endocytosed by a target cell, for example, they become destabilized and release their
contents. This destabilization is termed fusogenesis. Dioleoylphosphatidylethanolamine
(DOPE) is the basis of many "fusogenic" systems.

Such liposomes typically comprise a zinc finger protein and a lipid
component, e.g., a neutral and/or cationic lipid, optionally including a receptor-
recognition molecule such as an antibody that binds to a predetermined cell surface
receptor or ligand (e.g., an antigen). A variety of methods are available for preparing
liposomes as described in, e.g., Szoka et al., Ann. Rev. Biophys. Bioeng. 9:467 (1980),
U.S. Pat. Nos. 4,186,183, 4,217,344, 4,235,871, 4,261,975, 4,485,054, 4,501,728,
4,774,085, 4,837,028, 4,946,787, PCT Publication No. WO 91/17424, Deamer &
Bangham, Biochim. Biophys. Acta 443:629-634 (1976); Fraley et al., Proc. Natl. Acad.
Sci. U.S.A. 76:3348-3352 (1979); Hope et al., Biochim. Biophys. Acta 812:55-65 (1985);
Mayer et al., Biochim. Biophys. Acta 858:161-168 (1986); Williams et al., Proc. Natl.
Acad. Sci. U.S.A. 85:242-246 (1988); Liposomes (Ostro (ed.), 1983, Chapter 1); Hope et
al., Chem. Phys. Lip. 40:89 (1986); Gregoriadis, Liposome Technology (1984) and Lasic,
Liposomes: from Physics to Applications (1993)). Suitable methods include, for example,
sonication, extrusion, high pressure/homogenization, microfluidization, detergent
dialysis, calcium-induced fusion of small liposome vesicles and ether-fusion methods, all
of which are well known in the art.

In certain embodiments of the present invention, it is desirable to target the
liposomes of the invention using targeting moieties that are specific to a particular cell
type, tissue, and the like. Targeting of liposomes using a variety of targeting moieties
(e.g., ligands, receptors, and monoclonal antibodies) has been previously described (see,
e.g., U.S. Patent Nos. 4,957,773 and 4,603,044).

Examples of targeting moieties include monoclonal antibodies specific to
antigens associated with neoplasms, such as prostate cancer specific antigen and MAGE.
Tumors can also be diagnosed by detecting gene products resulting from the activation or
over-expression of oncogenes, such as ras or c-erbB2. In addition, many tumors express
antigens normally expressed by fetal tissue, such as the alphafetoprotein (AFP) and
carcinoembryonic antigen (CEA). Sites of viral infection can be diagnosed using various
viral antigens such as hepatitis B core and surface antigens (HBVc, HBVs), hepatitis C
antigens, Epstein-Barr virus antigens, human immunodeficiency type-1 virus (HIV-1) and
papilloma virus antigens. Inflammation can be detected using molecules specifically
recognized by surface molecules which are expressed at sites of inflammation such as
integrins (e.g., VCAM-1), selectin receptors (e.g., ELAM-1) and the like.

Standard methods for coupling targeting agents to liposomes can be used. 
These methods generally involve incorporation into liposomes of lipid components, e.g.,
phosphatidylethanolamine, which can be activated for attachment of targeting agents, or
derivatized lipophilic compounds, such as lipid derivatized bleomycin. Antibody targeted
liposomes can be constructed using, for instance, liposomes which incorporate protein
A (see Renneisen et al., J. Biol. Chem. 265:16337-16342 (1990) and Leonetti et al., Proc.
Natl. Acad. Sci. U.S.A. 87:2448-2451 (1990)).

Assays for defining regulation of gene expression by zinc finger proteins 

A variety of assays can be used to determine association of a candidate 
gene with a selected phenotype. The activity of a particular gene regulated by a zinc
finger protein can be assessed using a variety of in vitro and in vivo assays, by measuring,
e.g., protein or mRNA levels, product levels, enzyme activity, tumor growth;
transcriptional activation or repression of a reporter gene; second messenger levels (e.g.,
cGMP, cAMP, IP3, DAG, Ca2+); cytokine and hormone production levels; and
neovascularization, using, e.g., immunoassays (e.g., ELISA and immunohistochemical
assays with antibodies), hybridization assays (e.g., RNase protection, northerns, in situ
hybridization, oligonucleotide array studies), colorimetric assays, amplification assays,
enzyme activity assays, tumor growth assays, phenotypic assays, cDNA array studies,
and the like.

Zinc finger proteins are often first tested for activity in vitro using cultured
cells, e.g., 293 cells, CHO cells, VERO cells, BHK cells, HeLa cells, COS cells, and the
like. Preferably, human or mouse cells are used. The zinc finger protein is often first
tested using a transient expression system with a reporter gene, and then regulation of the
target candidate gene is tested in cells and in animals, both in vivo and ex vivo. The zinc
finger protein can be recombinantly expressed in a cell, recombinantly expressed in cells
transplanted into an animal, or recombinantly expressed in a transgenic animal, as well as
administered as a protein to an animal or cell using delivery vehicles described herein.
The cells can be immobilized, be in solution, be injected into an animal, or be naturally
occurring in a transgenic or non-transgenic animal.

Modulation of gene expression and association of the candidate gene with 
a selected phenotype is tested using one of the in vitro or in vivo assays described herein.
Cells or subject animals comprising the candidate genes are contacted with zinc finger
proteins and compared to control genes or second candidate genes to examine the extent
of phenotype modulation. For regulation of gene expression, the zinc finger protein
optionally has a Kd of 200 nM or less, more preferably 100 nM or less, more preferably
50 nM or less, most preferably 25 nM or less.

The effects of the zinc finger proteins can be measured by examining any 
of the parameters described above. Any suitable gene expression, phenotypic, or 
physiological change can be used to assess the influence of a zinc finger protein. When
the functional consequences are determined using intact cells or animals, one can also
measure a variety of effects such as tumor growth, neovascularization, hormone release,
transcriptional changes to both known and uncharacterized genetic markers (e.g., northern
blots or oligonucleotide array studies), changes in cell metabolism such as cell growth or
pH changes, and changes in intracellular second messengers such as cGMP.

Examples of assays for a selected phenotype include e.g., transformation 
assays, e.g., changes in proliferation, anchorage dependence, growth factor dependence, 
foci formation, growth in soft agar, tumor proliferation in nude mice, and tumor 
vascularization in nude mice; apoptosis assays, e.g., DNA laddering and cell death,
expression of genes involved in apoptosis; signal transduction assays, e.g., changes in
intracellular calcium, cAMP, cGMP, IP3, changes in hormone and neurotransmitter
release; receptor assays, e.g., estrogen receptor and cell growth; growth factor assays,
e.g., EPO, hypoxia and erythrocyte colony forming units assays; enzyme product assays,
e.g., FAD-2 induced oil desaturation; transcription assays, e.g., reporter gene assays; and
protein production assays, e.g., VEGF ELISAs.

In one embodiment, the assay for the selected phenotype is performed in
vitro. In one preferred in vitro assay format, zinc finger protein regulation of gene
expression in cultured cells is examined by determining protein production using an
ELISA assay.

In another embodiment, zinc finger protein regulation of candidate gene
expression is determined in vitro by measuring the level of target gene mRNA expression.
The level of gene expression is measured using amplification, e.g., using PCR, LCR, or
hybridization assays, e.g., northern hybridization, RNase protection, dot blotting. RNase
protection is used in one embodiment. The level of protein or mRNA is detected using
directly or indirectly labeled detection agents, e.g., fluorescently or radioactively labeled
nucleic acids, radioactively or enzymatically labeled antibodies, and the like, as described
herein.

Alternatively, a reporter gene system can be devised using a target gene 
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT or β-gal. The reporter construct is typically co-transfected into a cultured cell.
After treatment with the zinc finger protein of choice, the amount of reporter gene
transcription, translation, or activity is measured according to standard techniques known
to those of skill in the art.

Another example of an assay format useful for monitoring zinc finger
protein regulation of candidate gene expression is performed in vivo. This assay is
particularly useful for examining zinc finger proteins that inhibit expression of tumor
promoting genes, genes involved in tumor support, such as neovascularization (e.g.,
VEGF), or that activate tumor suppressor genes such as p53. In this assay, cultured tumor
cells expressing the zinc finger protein of choice are injected subcutaneously into an
immune compromised mouse such as an athymic mouse, an irradiated mouse, or a SCID
mouse. After a suitable length of time, preferably 4-8 weeks, tumor growth is measured,
e.g., by volume or by its two largest dimensions, and compared to the control. Tumors
that have a statistically significant reduction (using, e.g., Student's T test) are said to have
inhibited growth. Alternatively, the extent of tumor neovascularization can also be
measured. Immunoassays using endothelial cell specific antibodies are used to stain for
vascularization of the tumor and the number of vessels in the tumor. Tumors that have a
statistically significant reduction in the number of vessels (using, e.g., Student's T test)
are said to have inhibited neovascularization.

Transgenic and non-transgenic animals are also used as an embodiment for
examining regulation of candidate gene expression in vivo. Transgenic animals typically
express the zinc finger protein of choice. Alternatively, animals that transiently express
the zinc finger protein of choice, or to which the zinc finger protein has been administered
in a delivery vehicle, can be used. Regulation of candidate gene expression is tested
using any one of the assays described herein. Animals can be observed and assayed for
functional changes, e.g., challenged with drugs, mitogens, viruses, pathogens, toxins, and
the like.
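
As an illustration of the statistical comparison mentioned above for the tumor growth
assay, the following is a minimal sketch of a two-sample Student's t test in Python. It is
not part of the described method. The tumor volumes are invented for illustration, the
function name student_t is hypothetical, equal variances in both groups are assumed, and
the critical value 2.306 is the standard two-sided 5% value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def student_t(control, treated):
    """Return (t, df) for a two-sample Student's t test with pooled variance.

    Assumes both groups have equal variance; `variance` is the sample
    variance (n - 1 denominator).
    """
    n1, n2 = len(control), len(treated)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    std_err = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean(control) - mean(treated)) / std_err
    return t, df


# Hypothetical tumor volumes (mm^3); these numbers are made up for illustration.
control = [100.0, 110.0, 90.0, 105.0, 95.0]   # tumors without the zinc finger protein
treated = [60.0, 70.0, 50.0, 65.0, 55.0]      # tumors expressing the zinc finger protein

t, df = student_t(control, treated)

# Two-sided 5% critical value of the t distribution for df = 8.
T_CRIT_DF8 = 2.306
inhibited = df == 8 and t > T_CRIT_DF8

print(f"t = {t:.2f}, df = {df}, growth inhibited: {inhibited}")
```

The same comparison could be applied to vessel counts when assessing neovascularization.
For other group sizes the critical value must be looked up for the matching degrees of
freedom.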

Transgenic mice and in vitro high throughput assays for drug discovery

A further application of the zinc finger protein technology is manipulating
gene expression in cell lines and transgenic animals. Once a selected candidate gene has
been associated with a phenotype, and the candidate gene has been validated as a drug
therapy target, cell and transgenic-animal based assays are developed for the purposes of
high throughput drug screening. A cell line or animal expressing the candidate gene is
provided with a zinc finger protein that regulates expression of the candidate gene. The
zinc finger protein is typically provided as a nucleic acid encoding the zinc finger protein,
although it can also be administered as a protein. The cell line or animal is then contacted
with test compounds to determine the effect of the compound upon the candidate gene
and the selected phenotype. The zinc finger protein technology is an improvement for
high throughput cell-based and animal assays, for example, because expression of the
zinc finger protein can be made conditional using small molecule systems.

In one embodiment of a high throughput assay for therapeutics, zinc finger
proteins can be used for regulation of candidate genes in cell lines or animals using the
small molecule regulated systems described herein. Expression and/or function of a zinc
finger-based repressor can be switched off during development and switched on at will in
the cells or animals. This approach relies on the addition of the zinc finger protein
expressing module only; homologous recombination is not required. Because the zinc
finger protein repressors are trans dominant, there is no concern about germline
transmission or homozygosity. These issues dramatically affect the time and labor
required to go from a poorly characterized gene candidate (a cDNA or EST clone) to a
mouse model. This ability can be used to rapidly identify and/or validate gene targets for
therapeutic intervention, generate novel model systems and permit the analysis of
complex physiological phenomena (development, hematopoiesis, transformation, neural
function etc.). Chimeric targeted mice can be derived according to Hogan et al.,
Manipulating the Mouse Embryo: A Laboratory Manual, (1988); Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, Robertson, ed., (1987); and Capecchi et
al., Science 244:1288 (1989).

Doses of zinc finger proteins 

The dose administered to a subject or a cell, in the context of the present
invention, should be sufficient to effect the desired phenotype. Particular dosage regimens
can be useful for determining phenotypic changes in an experimental setting, e.g., in
functional genomics studies, and in cell or animal models. The dose is determined by the
efficacy and Kd of the particular zinc finger protein employed, the nuclear volume of the
target cell, and the condition of the cell or patient, as well as the body weight or surface
area of the cell or patient to be treated. The size of the dose also is determined by the
existence, nature, and extent of any adverse side-effects that accompany the
administration of a particular compound or vector in a particular cell or patient.

The maximum effective dosage of zinc finger protein for approximately
99% binding to target sites is calculated to be in the range of less than about 1.5x10^5 to
1.5x10^6 copies of the specific zinc finger protein molecule per cell. The number of zinc
finger proteins per cell for this level of binding is calculated as follows, using the volume
of a HeLa cell nucleus (approximately 1000 µm^3 or 10^-12 L; Cell Biology, (Altman &
Katz, eds. (1976)). As the HeLa nucleus is relatively large, this dosage number is
recalculated as needed using the volume of the target cell nucleus. This calculation also
does not take into account competition for zinc finger protein binding by other sites. This
calculation also assumes that essentially all of the zinc finger protein is localized to the
nucleus. A value of 100x Kd is used to calculate approximately 99% binding to the
target site, and a value of 10x Kd is used to calculate approximately 90% binding to the
target site. For this example, Kd = 25 nM.

ZFP + target site ⇌ complex, i.e.,

$$\mathrm{DNA} + \mathrm{protein} \rightleftharpoons \mathrm{DNA{:}protein\ complex}$$

$$K_d = \frac{[\mathrm{DNA}]\,[\mathrm{protein}]}{[\mathrm{DNA{:}protein\ complex}]}$$

When 50% of the target site is bound, Kd = [protein]. So when [protein] = 25 nM and the
nucleus volume is 10^-12 L:

$$(25 \times 10^{-9}\ \mathrm{moles/L})\,(10^{-12}\ \mathrm{L/nucleus})\,(6 \times 10^{23}\ \mathrm{molecules/mole}) = 15{,}000\ \mathrm{molecules/nucleus}$$

for 50% binding.

When 99% of the target is bound, 100x Kd = [protein] = 2.5 µM:

$$(2.5 \times 10^{-6}\ \mathrm{moles/L})\,(10^{-12}\ \mathrm{L/nucleus})\,(6 \times 10^{23}\ \mathrm{molecules/mole}) = 1{,}500{,}000\ \mathrm{molecules/nucleus}$$

that is, about 1,500,000 molecules per nucleus for 99% binding of target site.
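
The arithmetic above can be reproduced with a short Python sketch. This is only an
illustration under the same simplifying assumptions as the text (single-site binding, free
protein in large excess over target sites, all protein localized to the nucleus, no
competing sites). The function and constant names are hypothetical and not part of the
specification. Avogadro's number is rounded to 6x10^23 in the example calls to match the
figures in the text.

```python
AVOGADRO = 6.022e23            # molecules per mole (the text rounds this to 6e23)
HELA_NUCLEUS_VOLUME_L = 1e-12  # about 1000 cubic micrometers, as given in the text


def fraction_bound(fold_over_kd):
    """Fraction of target sites occupied when [protein] = fold_over_kd * Kd.

    Follows from Kd = [DNA][protein] / [complex] for a single site:
    bound fraction = [protein] / (Kd + [protein]).
    """
    return fold_over_kd / (1.0 + fold_over_kd)


def molecules_per_nucleus(kd_molar, fold_over_kd,
                          nucleus_volume_l=HELA_NUCLEUS_VOLUME_L,
                          avogadro=AVOGADRO):
    """Number of protein molecules in one nucleus at [protein] = fold_over_kd * Kd."""
    return kd_molar * fold_over_kd * nucleus_volume_l * avogadro


if __name__ == "__main__":
    kd = 25e-9  # 25 nM, the example Kd used in the text
    for fold in (1, 10, 100):
        n = molecules_per_nucleus(kd, fold, avogadro=6e23)
        print(f"{fold:>3}x Kd: {fraction_bound(fold):.1%} of sites bound, "
              f"about {n:,.0f} molecules per nucleus")
```

For a target cell with a smaller nucleus, passing its volume as nucleus_volume_l rescales
the copy number in the way the text describes.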

The appropriate dose of an expression vector encoding a zinc finger
protein can also be calculated by taking into account the average rate of zinc finger
protein expression from the promoter and the average rate of zinc finger protein
degradation in the cell. Preferably, a weak promoter such as a wild-type or mutant HSV
TK is used, as described above. The dose of zinc finger protein in micrograms is
calculated by taking into account the molecular weight of the particular zinc finger
protein being employed.

In determining the effective amount of the zinc finger protein to be
administered, circulating plasma levels of the zinc finger protein or nucleic acid encoding
the zinc finger protein, potential zinc finger protein toxicities, progression of the
phenotype, and the production of anti-zinc finger protein antibodies are evaluated.
Administration can be accomplished via single or divided doses.
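
The conversion from a copy number per cell to a protein mass in micrograms, mentioned
above, can be sketched as follows. This is an illustrative calculation only. The function
name, the 30,000 Da molecular weight and the cell count are hypothetical values chosen for
the example, and the result counts only the molecules needed inside the cells, with no
allowance for delivery efficiency or degradation.

```python
def protein_mass_micrograms(molecules_per_cell, n_cells, mol_weight_da,
                            avogadro=6.022e23):
    """Mass (in micrograms) of protein corresponding to a given copy number.

    mol_weight_da is the molecular weight in daltons, i.e. grams per mole.
    """
    moles = molecules_per_cell * n_cells / avogadro
    grams = moles * mol_weight_da
    return grams * 1e6  # grams to micrograms


if __name__ == "__main__":
    # Hypothetical example: 1.5e6 molecules per cell (the 99% binding figure),
    # one million cells, and an assumed 30,000 Da zinc finger protein.
    mass = protein_mass_micrograms(1.5e6, 1e6, 30000.0)
    print(f"about {mass:.3f} micrograms of zinc finger protein")
```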

Administration of effective amounts is by any of the routes normally used
for introducing zinc finger protein into ultimate contact with the tissue to be treated. The
zinc finger proteins are administered in any suitable manner, preferably with
pharmaceutically acceptable carriers. Suitable methods of administering such
compositions are available and well known to those of skill in the art, and, although more
than one route can be used to administer a particular composition, a particular route can
often provide a more immediate and more effective reaction than another route.

Pharmaceutically acceptable carriers are determined in part by the
particular composition being administered, as well as by the particular method used to
administer the composition. Accordingly, there is a wide variety of suitable formulations
of pharmaceutical compositions of the present invention (see, e.g., Remington's
Pharmaceutical Sciences, 17th ed. 1985).

The zinc finger proteins, nucleic acids encoding the same, alone or in
combination with other suitable components, can be made into aerosol formulations (i.e.,
they can be "nebulized") to be administered via inhalation. Aerosol formulations can be
placed into pressurized acceptable propellants, such as dichlorodifluoromethane, propane,
nitrogen, and the like.

Formulations suitable for parenteral administration, such as, for example,
by intravenous, intramuscular, intradermal, and subcutaneous routes,
include aqueous and non-aqueous, isotonic sterile injection solutions,
which can contain antioxidants, buffers, bacteriostats, and solutes that
render the formulation isotonic with the blood of the intended recipient,
and aqueous and non-aqueous sterile suspensions that can include
suspending agents, solubilizers, thickening agents, stabilizers, and
preservatives. In the [...] infusion, orally, topically,
intraperitoneally, intravesically or intrathecally. The formulations of
compounds can be presented in unit-dose or multi-dose sealed containers,
such as ampules and vials. Injection solutions and suspensions can be
prepared from sterile powders, granules, and tablets of the kind
previously described.

All publications and patent applications cited in this specification are
herein incorporated by reference as if each individual publication or
patent application were specifically and individually indicated to be
incorporated by reference.

Although the foregoing invention has been described in some detail by
way of illustration and example for purposes of clarity of understanding,
it will be readily apparent to one of ordinary skill in the art in light
of the teachings of this invention that certain changes and modifications
may be made thereto without departing from the spirit or scope of the
appended claims.
EXAMPLES

The following examples are provided by way of illustration only and not
by way of limitation. Those of skill in the art will readily recognize a
variety of noncritical parameters that could be changed or modified to
yield essentially similar results.

Example I: Targeting human VEGF gene with zinc finger proteins for target validation

An important consideration in target validation is to efficiently
determine and accurately evaluate the relationship between a targeted gene
and resulting phenotype. This example demonstrates the use of the zinc
finger protein technology to validate a gene as a target for the
development of therapeutic compounds that can regulate, e.g., expression
of the gene or the function of the gene product. This process is based on
the following simple assumptions (Figure 1).

If a gene X is up-regulated by a ZFP-A1, which specifically targets at the
X1 site, a phenotype Q is observed.

If the gene X is up-regulated by ZFP-A2, which specifically targets at a
different site X2, the same phenotype Q should be observed.

If the gene X is down-regulated by ZFP-B1, which targets at the X3 site
(X3 can be X1 or X2), a different phenotype Z should be observed.

If the ZFP-A1, ZFP-A2, or ZFP-B1 are used to target a gene that is not
involved in the phenotype Q, no phenotype change related to this gene
should be observed.
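
By way of illustration only, the four assumptions above can be expressed as a simple consistency check. This is a minimal sketch; the function name, its arguments, and the use of None to mean "no phenotype change" are hypothetical conventions chosen for the example and are not part of the described method.

```python
def supports_target(up_a1, up_a2, down_b1, unrelated):
    """Return True when observed phenotypes match the four assumptions.

    Each argument is the phenotype label observed in one experiment
    ("Q" and "Z" follow the text); None means no phenotype change.
    """
    # ZFP-A1 and ZFP-A2 bind different sites but should give the same phenotype Q.
    same_on_activation = up_a1 == "Q" and up_a2 == "Q"
    # ZFP-B1 (repressor) should give a phenotype that differs from Q.
    different_on_repression = down_b1 is not None and down_b1 != "Q"
    # Targeting a gene not involved in phenotype Q should change nothing.
    silent_control = unrelated is None
    return same_on_activation and different_on_repression and silent_control


print(supports_target("Q", "Q", "Z", None))
print(supports_target("Q", None, "Z", None))
```

The second call models the case in which a second activator at a different site fails to reproduce phenotype Q, so the gene would not be considered validated under these assumptions.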

The human and mouse vascular endothelial growth factor (VEGF) genes
were selected for target validation in this example. VEGF is an
approximately 46 kDa glycoprotein that is an endothelial cell-specific
mitogen induced by hypoxia. VEGF binds to endothelial cells via
interaction with tyrosine kinase receptors Flt-1 (VEGFR-1) and Flk-1/KDR
(VEGFR-2). Since VEGF plays a very important role in angiogenesis,
targeting this gene for development of therapeutics has attracted great
interest. While inhibition (down-regulation) of the VEGF gene may be used
for cancer and diabetic retinopathy treatments, activation
(up-regulation) of the gene may be used for ischemic heart and tissue
diseases. These two desired phenotypic changes make the VEGF gene ideal
for target validation using zinc finger protein technology.
Testing zinc finger proteins for biochemical affinity and specificity in vitro

The DNA target sites for zinc finger proteins were chosen in a region
surrounding the transcription site of the targeted gene. The primary
targets were chosen within the region approximately 1 kb upstream of the
transcription initiation site, where a majority of enhancer elements are
located. Each 3-finger zinc finger protein recognizes a 9-bp DNA
sequence. To increase DNA-binding specificity, two 3-finger zinc finger
proteins are fused together in order to target two 9-bp DNA sequences that
are in a close proximity (Liu et al., Proc. Natl. Acad. Sci. U.S.A.
94:5525-5530 (1997)).
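
By way of illustration only, locating a 9-bp target site within an upstream region can be sketched as follows. The upstream sequence used here is a made-up toy string, not VEGF promoter sequence, and the function names are hypothetical; the two 9-bp sites are the ones named elsewhere in this example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq.upper()))


def find_site(upstream, site):
    """Yield (offset, strand) for each occurrence of a site on either strand."""
    upstream, site = upstream.upper(), site.upper()
    rc = reverse_complement(site)
    for i in range(len(upstream) - len(site) + 1):
        window = upstream[i:i + len(site)]
        if window == site:
            yield i, "+"
        if window == rc:
            yield i, "-"


# Toy region: one site on the top strand, one on the bottom strand.
toy = "ACGT" + "GGGGTTGAG" + "TTT" + reverse_complement("GAAGGGGGC") + "AA"

print(list(find_site(toy, "GGGGTTGAG")))
print(list(find_site(toy, "GAAGGGGGC")))
```

Offsets are zero-based positions within the supplied region; converting them to distances from the transcription initiation site would depend on how the region was extracted.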

Human SP-1 or murine Zif268 transcription factors were used as a
progenitor molecule for the construction of designed zinc finger proteins.
The amino acid sequences (fingers), which recognize the target DNA
sequence, were designed based on the "recognition rules" described herein.
The designed zinc finger protein genes were constructed using a PCR-based
procedure that utilizes six overlapping oligonucleotides. The methods of
designing and assembling zinc finger protein genes that target VEGF are
detailed in USSN 09/229,037.

The designed zinc finger protein genes were initially cloned into the
pMAL-KNB vector after digesting with KpnI and BamHI (Figure 2). The
pMAL-KNB vector is modified from the pMAL-c2 vector (New England Biolabs,
MA). The zinc finger proteins were purified from bacteria and were
subjected to biochemical affinity and specificity assays. The methods for
these in vitro assays are described herein and in co-pending application
USSN 09/229,037.

Activation or repression of a luciferase promoter in transiently transfected cells 

The zinc finger proteins with high biochemical affinity and specificity
were subcloned into the KpnI and BamHI sites in pcDNA-NVF or pcDNA-NKF
(Figure 2). The pcDNA-NVF construct contains a CMV promoter-controlled
sequence encoding a nuclear localization signal, a herpes simplex virus
VP16 activation domain, and a Flag peptide. This construct was designed to
up-regulate the targeted gene when introduced into mammalian cells. The
pcDNA-NKF construct contains the Kruppel-associated box (KRAB) repression
domain instead of VP16 domain and was used for down-regulation of the
targeted genes. These constructs are described in detail in co-pending
application USSN 09/229,037.
The reporter plasmid system is based on the pGL3-promoter and
pGL3-control vectors (Promega, WI). Three tandem repeats of the zinc
finger protein target sites were inserted upstream of the SV40 promoter
(Figure 3). The pGLP reporters were used to evaluate the activities of the
engineered zinc finger proteins for activation of gene expression and the
pGLC reporters were used to measure the effects of ZFP-KRAB activities on
inhibition of gene expression. These constructs are described in detail in
co-pending USSN 09/229,037.

The control plasmids used in this example are shown in Figure 2.
pcDNA-NVF (or pcDNA-NKF) is a ZFP-less effector. pcV-RAN (or pcK-RAN)
expresses all components except that the engineered zinc finger protein
has no known DNA binding capability (Figure 2). The zinc finger protein
sequence in the pcV-RAN (or pcK-RAN) [...]

[...]QHICHXQGCG[...]

DEVGLHKRTHTGEKKFACPECPKKFM[...]V[...]HIKTHQ[...]GGS, where the [...] are
underlined. These control constructs were used to check the effects of the
regulation domains (VP16 or KRAB), in the absence of the DNA binding
domain. The pc-ZFP-cat plasmid expresses a specifically designed zinc
finger protein; however, the functional domain (VP16 or KRAB) was replaced
with a 234 bp fragment isolated from the chloramphenicol acetyltransferase
(CAT) gene in the pcDNA3.1/CAT vector (nt 1442 to 1677) (Invitrogen, CA)
(Figure 2). This control plasmid was used to test whether the DNA binding
domain alone has any effects on gene expression. The other controls
include effectors expressing zinc finger proteins that recognize different
DNA sequences and reporters containing non-specific zinc finger protein
target sequences.

The following example demonstrates the effect of a designed zinc finger
protein, which activates the luciferase reporter gene in 293 cells. The
targeted sequence, GGGGTTGAG, is named M6-1892S and is in the promoter
region of the human VEGF gene. The zinc finger protein recognizing this
9-bp DNA sequence was designed and assembled as described herein and in
USSN 09/229,037. The DNA sequence and the amino acid sequence of the zinc
finger protein are shown below.
5'-GGTACCGGGCAAGAAGAAGCAGCACATCTGCCACATCCAGGGCTGTGGTAAAGTT
     V  P  G  K  K  K  Q  H  I  C  H  I  Q  G  C  G  K  V

   TACGGCCGCTCCGACAACCTGACCCGCCACCTGCGCTGGCACACCGGCGAGAGGCCT
    Y  G  R  S  D  N  L  T  R  H  L  R  W  H  T  G  E  R  P
                  (Finger 1: GAG)

   TTCATGTGTACATGGTCCTACTGTGGTAAACGCTTCACCAACCGCGACACCCTGGCC
    F  M  C  T  W  S  Y  C  G  K  R  F  T  N  R  D  T  L  A
                  (Finger 2: GTT)

   CGCCACAAGCGTACCCACACCGGTGAGAAGAAATTTGCTTGTCCGGAATGTCCGAAG
    R  H  K  R  T  H  T  G  E  K  K  F  A  C  P  E  C  P  K
   CGCTTCATGCGCTCCGACCACCTGTCCAAGCACATCAAGACCCACCAGAACAAGAAG
    R  F  M  R  S  D  H  L  S  K  H  I  K  T  H  Q  N  K  K
                  (Finger 3: GGG)

   GGTGGATCC-3'
    G  G  S
      BamHI
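
By way of illustration only, the correspondence between the DNA lines and the amino acid lines of the listing can be checked by translating with the standard genetic code. This is a minimal sketch; the codon table is built from the conventional TCAG ordering, the function name is illustrative, and only three DNA lines from the listing (the two finger 3 lines and the BamHI end) are used as input.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


finger3_a = "CGCCACAAGCGTACCCACACCGGTGAGAAGAAATTTGCTTGTCCGGAATGTCCGAAG"
finger3_b = "CGCTTCATGCGCTCCGACCACCTGTCCAAGCACATCAAGACCCACCAGAACAAGAAG"
bamhi_end = "GGTGGATCC"

print(translate(finger3_a))
print(translate(finger3_b))
print(translate(bamhi_end))
```

Note that the first line of the listing begins one nucleotide before the reading frame (the leading G of the GGTACC site), so it would need to be translated starting from its second nucleotide.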



The KpnI-BamHI DNA fragment of the assembled zinc finger protein was
cloned into KpnI-BamHI sites of the pMAL-KNB vector. The ability of the
designed zinc finger proteins to bind their target sites was verified by
expressing and purifying recombinant proteins from E. coli and performing
electrophoretic mobility shift assays (EMSA). The binding affinity (Kd) of
the protein shown above was 20 nM, as determined by EMSA. This KpnI-BamHI
ZFP fragment was then subcloned into KpnI-BamHI sites of the pcDNA-NVF
vector and was named pcV-VF471A. The luciferase reporter plasmid
containing three tandem repeats of the M6-1892S sites was made and named
pGLP-VF471x3.

All plasmid DNA was prepared using Qiagen plasmid purification kits.
The human embryonic kidney 293 cells were seeded into each well of a
6-well plate with a density to reach approximately 70% confluence the next
day. Cells were co-transfected with 50 ng effector DNA (ZFP-expression
plasmid), 900 ng reporter DNA and 100 ng pCMV-LacZ DNA using either
Lipofectamine (GIBCO-BRL, MD) or GenePORTER (Gene Therapy Systems Inc.,
CA) transfection reagent. The co-expressed β-galactosidase activity was
used as a control to normalize the luciferase activity. Cell lysates were
harvested 40 to 48 hours after transfection. Luciferase assays were
performed using the Dual-Light Luciferase and β-galactosidase Reporter
Assay System (Tropix, MA). A typical luciferase assay result is shown in
Figure 4.
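
By way of illustration only, normalizing luciferase activity to the co-expressed β-galactosidase activity and expressing the result as fold activation over a control effector can be sketched as follows. The raw readings below are hypothetical numbers chosen for the example, not measured data, and the function names are illustrative.

```python
def normalized_activity(luciferase, beta_gal):
    """Luciferase signal divided by the beta-galactosidase signal."""
    return luciferase / beta_gal


def fold_activation(effector, control):
    """Ratio of normalized activities; each argument is (luciferase, beta_gal)."""
    return normalized_activity(*effector) / normalized_activity(*control)


# Hypothetical raw readings in arbitrary light units.
zfp_effector = (8000.0, 50.0)     # designed ZFP-VP16 effector
control_effector = (2000.0, 100.0)  # control effector with no DNA binding

print(fold_activation(zfp_effector, control_effector))
```

Dividing by the β-galactosidase signal first means that wells with different transfection efficiencies can still be compared on the same scale.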

This example demonstrated that this designed ZFP-expressing plasmid,
pcV-VF471A, was able to stimulate the luciferase gene expression by 8 fold
when compared with control plasmid pcV-RAN, which does not possess known
DNA binding capability. When the VP16 domain was replaced with a peptide,
which has no transcription regulation activity, this zinc finger protein
(pcV-VF471A-cat) lost its activity of trans-activating the luciferase
gene. The designed zinc finger protein (pcV-VF471A) failed to activate the
luciferase expression from the reporter containing a different zinc finger
protein binding site, indicating that the trans-activation effect is [...]
the regulation domain (VP16) in this example were able to turn on the gene
at an appropriate target site.

Testing a reporter containing native promoter of the targeted gene in transiently transfected cells

The difference between the simple reporter system and the native reporter
system is that the native reporter plasmid construct contains the promoter
of the targeted gene. A unique advantage for the native reporter system is
that a single native reporter plasmid construct can be used to analyze the
effects of multiple zinc finger proteins in the context of the promoter.
The pGLP-native reporter was constructed by replacing the SV40 promoter
in pGL3-promoter with a DNA fragment containing the promoter and flanking
sequences of the targeted gene (Figure 3). In this example, the native
reporter construct of the human VEGF gene was generated by PCR amplifying
a 3319-bp fragment from the human genomic DNA. This fragment contains the
VEGF promoter and its flanking regions. The VEGF ATG codon was fused to
the luciferase coding region. Nested PCR is performed for the
amplification. The external primers were hVEGFU1
(5'-GAATTCTGTGCCCTCACTCCCCTGG; nt 1 to 25 based on GenBank sequence
M63971) and VEGFD2 (5'-ACCGCTTACCTTGGCATGGTGGAGG; nt 3475 to 3451). The
internal primer pair are hVEGFU2 (5'-ACACACCTTGCTGGGTACCACCATG; nt 71 to
95, KpnI site underlined) and VEGFD1 (5'-GCAGAAAGTcCATGGTTTCGGAGGCC; nt
3413 to 3388, a T to C substitution is made to generate the underlined
NcoI site). The nested PCR product was digested with [...] plasmid
(Figure 3). The human VEGF native reporter plasmid was named pGLPVFH.
A similar strategy was used to amplify a 2070-bp fragment from the mouse
genomic DNA. The external primers were mVEGFU2
(5'-TGTTTAGAAGATGAACCGTAAGCCT; nt 1 to 25 based on GenBank sequence
U41383) and VEGFD2 (5'-ACCGCTTACCTTGGCATGGTGGAGG; nt 3475 to 3451 based on
M63971). The internal primers were mVEGF
(5'-GCCCCCATTGGtACCCTGGCTTCAGTTCCCTGGCAACA; nt 155 to 192; a C to T
replacement is made to generate the underlined KpnI site) and VEGFD1
(5'-GCAGAAAGTcCATGGTTTCGGAGGCC; nt 3413 to 3388 based on M63971; a T to C
substitution is made to generate the underlined NcoI site). VEGFD2 and
VEGFD1 primers were used to amplify both human and mouse genomic DNA since
the sequences are highly homologous at that region (Shima et al., J. Biol.
Chem. 271:3877 (1996)). The murine VEGF native reporter plasmid was called
pGLPmVF.
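
By way of illustration only, the primer sequences can be sanity-checked against their stated nucleotide coordinates, since a primer spanning nt a to b should be |a - b| + 1 nucleotides long. This is a minimal sketch using four primers as written in the text; the dictionary layout and function names are hypothetical.

```python
# name: (sequence as written, first coordinate, second coordinate)
PRIMERS = {
    "hVEGFU1": ("GAATTCTGTGCCCTCACTCCCCTGG", 1, 25),
    "VEGFD2": ("ACCGCTTACCTTGGCATGGTGGAGG", 3475, 3451),
    "mVEGFU2": ("TGTTTAGAAGATGAACCGTAAGCCT", 1, 25),
    "VEGFD1": ("GCAGAAAGTcCATGGTTTCGGAGGCC", 3413, 3388),
}

NCOI_SITE = "CCATGG"  # NcoI recognition sequence


def span_length(start, end):
    """Number of nucleotides covered by an inclusive coordinate range."""
    return abs(end - start) + 1


def length_matches(name):
    """True when the primer length equals its stated coordinate span."""
    seq, start, end = PRIMERS[name]
    return len(seq) == span_length(start, end)


for name in PRIMERS:
    print(name, len(PRIMERS[name][0]), length_matches(name))

# The lowercase c marks the T to C substitution that creates the NcoI site.
print(NCOI_SITE in PRIMERS["VEGFD1"][0].upper())
```

A check of this kind catches transcription errors in a primer list but says nothing about whether the primer actually matches the GenBank record, which would require the reference sequence itself.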

The following example demonstrates that two designed zinc finger
proteins were able to up-regulate the human VEGF native promoter gene in
293 cells. One zinc finger protein (pcV-M6-2009A) was designed to target a
proximal site GAAGGGGGC located at 362-bp upstream of the transcription
start site and the other one (pcV-M6-111S) was designed to target a distal
site ATGGGGGTG located at 2240-nt upstream of the transcription start
site. Similar to the luciferase reporter assay described above, 50 to 100
ng of effector DNA are co-transfected with 900 ng of native reporter DNA
and 100 ng of pCMVlacZ DNA. Luciferase activities were measured
approximately 40 hours post-transfection and were shown as fold activation
in Figure 5.

Primary zinc finger proteins to activate or repress the endogenous human and mouse VEGF genes in cell culture

To test whether these engineered zinc finger proteins can activate or
repress the endogenous human and mouse VEGF genes in cell culture,
transient transfection experiments were conducted. The human 293 cells and
mouse mammary epithelial cells C127I (Shima et al., J. Biol. Chem.
271:3877 (1996)) express low levels of endogenous VEGF proteins, which are
used to evaluate the zinc finger protein effect on VEGF activation. The
human glioblastoma U87MG cells, the mouse neuroblastoma NB41 cells (Levy
et al., Growth Factors 2:9 (1989)) and the rat glioma GS-9L cells (Conn et
al., Proc. Natl. Acad. Sci. U.S.A. 87:1323 (1990)) express high levels of
endogenous VEGF proteins, which are used for testing the repression
effects of the zinc finger proteins. These cells are seeded into each well
of a 6-well plate with a density to reach approximately 70% confluence the
next day. 0.1 to [...] of effector DNA are usually used to transfect the
cells using either Lipofectamine or GenePORTER transfection reagent,
depending on the cell types. Approximately 14 hours after transfection,
[...] harvested and endogenous VEGF levels are measured using the [...]
ELISA (R&D [...]). [...] M6-2009S ZFPs [...] designed as primary zinc
finger proteins [...] VEGF gene regulation. The results in [...]
endogenous VEGF gene expression in 293 cells.

Table 1. Activation of Human Endogenous VEGF Gene by zinc finger proteins
in 293 Cells

                 Effector        Target
Vector control   pcV-RAN         None
Primary ZFP      pcV-M6-111S     ATGGGGGTG
Primary ZFP      pcV-M6-2009S    GAAGGGGGC
Secondary ZFP    pcV-M6-120S     GGGGGTGCC
Secondary ZFP    pcV-M6-1878S    GAGTGTGTG

[Remaining columns of the table are illegible in the source.]

* Distance between the target sites and the VEGF transcription initiation
site.
NF: Not found in the vicinity of the VEGF promoter region.

To repress the targeted gene, the designed zinc finger [...] NKF vector.
After transfection of the DNA into the appropriate cells, the ZFP-KRAB
[...] as the cotransfected luciferase reporter gene. The example [...]
shown in Table 1. M6-111S ZFP recognizes the target sequence [...] the
M6-111S ZFP fused to KRAB repression domain, an approximately 80%
repression on the cotransfected luciferase reporter gene expression and
approximately 40% repression on the endogenous VEGF gene expression were
achieved.

Secondary zinc finger proteins to activate or repress the endogenous human and mouse VEGF genes in cell culture

To confirm that the physiological effects observed using the primary zinc
finger proteins are due to the effects on the VEGF gene and not other side
effects such as regulation of alternative gene targets, secondary zinc
finger proteins that target the VEGF gene at sites different than that of
the primary zinc finger protein were engineered. As shown in Table 1, the
two secondary zinc finger proteins also activate the endogenous VEGF gene
expression in cell culture. [...] protein technology can be used to
regulate gene [...] target for therapeutics.

Tertiary zinc finger proteins to target the genes not involved in VEGF physiology

To confirm that the physiological effects observed using the primary and
[...] any non-specific DNA-binding or squelching effects, tertiary zinc
finger proteins that target genes not involved in VEGF physiology are used
as negative controls. For example, a zinc finger protein designed for
regulating human EPO gene expression is used as a specificity control (see
Example II). EPO is [...] by hypoxia and thus is useful as a control for
VEGF target validation using a hypoxia assay. VEGF inhibition specifically
reverses diabetic retinopathy. This result validates VEGF as a molecular
target for drug discovery and development.

Test the VEGF inhibition effect on a diabetic retinopathy model in rodents

Diabetic retinopathy is the most common cause of blindness amongst [...]
endogenous VEGF gene expression [...] associated virus (AAV) and/or
retrovirus vectors are constructed as described above. These virus vectors
express the zinc finger proteins that are fused with the KRAB repression
domain as described above. The viruses are generated, purified, and
injected into the animals. The efficacy of the engineered zinc finger
proteins is evaluated by suppression of retinal neovascularization as
previously described (Adamis et al., Arch. Ophthalmol. 114:66 (1996);
Pierce et al., Proc. Natl. Acad. Sci. U.S.A. 92:905 (1995); Aiello et al.,
Proc. Natl. Acad. Sci. U.S.A. 92:10457 (1995); Smith et al., Invest.
Ophthalmol. Vis. Sci. 35:101 (1994)). All necessary controls, including
the viral vectors expressing the secondary and tertiary zinc finger
proteins, are also used.

Test the VEGF activation effect on a peripheral artery disease model in rodents

Stimulation of peripheral angiogenesis by VEGF to augment collateral
artery development is a potentially novel form of therapy for patients
with ischemic vascular disease. The same strategy described above is used
to validate VEGF as a target using a mouse peripheral artery disease
model. The AAV or retrovirus vectors, which express the zinc finger
proteins fused to VP16 activation domain, are constructed as described
above. The efficacy of the zinc finger proteins is evaluated similar to
the procedures described previously (Couffinhal et al., Am. J. Pathol.
152:1667 (1998); Takeshita et al., Lab. Invest. 75:487 (1996); Isner et
al., Human Gene Therapy 7:959 (1996)). All necessary controls, including
the viral vectors expressing the secondary and tertiary zinc finger
proteins, are also used. VEGF overexpression triggers collateral artery
growth. This result validates VEGF as a target for drug discovery and
development.

Example II: Erythropoietin Gene Target Discovery

Mammalian erythropoiesis is regulated via stimulation of the erythroid
progenitors by certain factor(s) that provide proliferation and
differentiation signals. Hypoxia is a potent signal that induces the
expression of genes controlling many physiologically relevant processes
(Ratcliffe et al., J. Exp. Biol. 201:1153 (1998)). One of the processes is
to "request" that certain tissues release a factor(s) for the production
of additional red blood cells. This phenomenon can be detected by
stimulating different cell lines and/or tissues with hypoxic conditions,
sampling the culture supernatants, and testing for the stimulation of
erythrocyte colony forming units from murine bone marrow cultures. Cell
lines or tissues found to respond to hypoxia in this way likely express
erythropoietic growth factors in a hypoxia inducible manner. The analysis
of genes differentially expressed in such cells or tissues upon hypoxic
treatment should lead to identification of erythropoietic growth factor
expressing genes. Zinc finger protein technology can be used as analytical
tools for such differential gene expression experiments and to validate
the hypothetical erythropoietic growth factor genes.

A collection of cell types (including human hepatoma cell line, Hep3B) are
cultured in appropriate medium and maintained in a humidified 5% CO2-95%
air incubator at 37°C. Hypoxic conditions are achieved by flushing 1%
O2-5% CO2-94% N2 for 18 hours (Goldberg et al., Blood 77:271 (1991)). The
culture supernatants are harvested and tested in colony forming assay
(Muller et al., Exp. Hematol. 21:1353 (1993); Eaves & Eaves, Blood 52:1196
(1978)). The human hepatoma Hep3B cell line is found to produce an
erythropoietic growth factor(s) upon hypoxic induction (Goldberg et al.,
Proc. Natl. Acad. Sci. U.S.A. 84:7972 (1987)) and this cell line is used
for further characterization.

One working hypothesis is that one (or more) of the cellular genes, which
are responsible for stimulating red cell production, is activated upon
hypoxia. This gene(s) may be identified by performing a differential gene
expression experiment, such as Differential Display (GeneHunter, TN),
PCR-Select cDNA Subtraction (Clontech, CA) or microarray (Affymetrix, CA).
The gene expression patterns of the RNA extracted from the Hep3B cells
growing under normal and hypoxic conditions are compared. [...] in the
hypoxic cells.

Approximately eighteen genes have been identified as up-regulated by
hypoxia (Ratcliffe et al., J. Exp. Biol. 201:1153 (1998)). The
erythropoietin (EPO) gene and the vascular endothelial growth factor
(VEGF) gene, which have been extensively studied, are used in this example
to demonstrate the application of the zinc finger protein technology to
functional genomics and identification of the gene encoding the
erythropoietic growth factor.
Based on the DNA sequences of the candidate genes identified from the
above experiments, primary zinc finger proteins are designed to target the
DNA sequences located in a proximity of the promoters. The zinc finger
protein construction and characterization process is the same as that
described in Example I. The zinc finger proteins (a 3-finger one or a
6-finger protein) with high DNA-binding affinity and specificity are fused
with either the HSV VP-16 activation domains or the KRAB repression
domains to activate or block expression of the individual genes on the
list.

These designed ZFP-VP16 constructs are individually transiently
transfected into Hep3B cells using the GenePORTER transfection reagent
(Gene Therapy Systems Inc., CA) under the non-hypoxic condition. 48 hours
post-transfection, the supernatants are collected and the colony forming
assays are performed. The gene(s) that induces the red cell production
upon zinc finger protein up-regulation is considered to be the gene(s)
that encodes an erythropoietic growth factor. The results indicate that
the erythropoietin (EPO) gene is responsible for the erythropoiesis
regulation while all other tested genes (including VEGF) are not. All
necessary zinc finger protein control constructs described in Example I
are also used in this example.

Another way to identify and validate the gene is to perform the similar
experiments described above except that these zinc finger proteins are
fused with the KRAB domains and the Hep3B cells are stimulated by hypoxia
14 hours post-transfection. When the zinc finger proteins, which are
designed to repress the EPO gene expression, are transfected into the
Hep3B cells, no or reduced activity based on the colony forming assay is
observed. All zinc finger proteins, which target genes other than the EPO
gene, do not affect the red cell production under hypoxic induction.

To further validate the gene function, secondary zinc finger proteins,
which target at different sites of the EPO gene, are constructed. These
secondary zinc finger proteins, when fused with VP16 activation domains,
activate the EPO gene expression and stimulate the red cell production.
Conversely, when fused with KRAB repression domains, these zinc finger
proteins inhibit the EPO gene expression under hypoxic condition and fail
to stimulate the red cell production.

Example III: Breast Cancer Target Gene Discovery

The growth of some breast tumors depends on the continued presence of
the hormone estrogen. Estrogen is likely involved in the up-regulation of
genes required for maintenance of the transformed phenotype. Cell lines
derived from these tissues (such as MCF-7, BT20 and T47D) retain this
dependence on estrogen for growth in culture. Thus, it appears estrogen
stimulates expression of essential genes in the dependent cell lines.
These estrogen-induced genes, once discovered, are useful molecular
targets for the development of new drugs to treat breast cancer. The use
of zinc finger proteins to identify estrogen-induced genes required for
estrogen-dependent cell growth is described herein. Furthermore, the newly
discovered targets are validated using zinc finger proteins and
appropriate controls.

Identifying ER-responsive genes 

MCF-7 cells are grown in the absence of estrogen (estradiol) for short
term (1 week) and long term (28 weeks) to allow transcription of
estradiol-induced genes to reach basal levels. Cells are propagated in 162
ml flasks, containing Dulbecco's Modified Eagle Medium (DMEM), lacking
phenol red and supplemented with 10% charcoal-stripped Fetal Calf Serum
(FCS) (Hyclone), 10 µg/ml insulin and 0.5 nM estradiol. Upon reaching 80%
confluence, cells are trypsinized and transferred to fresh medium lacking
estradiol. The flasks are incubated at 37°C in a humidified atmosphere of
5% CO2.

Estrogen-responsive gene expression is stimulated by adding estradiol to
the cells. The cells grown in the absence of estradiol are split into
fresh medium lacking estradiol. One flask will receive 10 nM estradiol
(dissolved in ethanol) while the other will receive an equivalent amount
of ethanol not containing estradiol. Both stimulated and unstimulated
cells are harvested after 6 hrs.

RNA is isolated from the cells for identifying differentially expressed
genes using a standard RNA isolation kit. Estrogen response genes are
identified using one or a combination of the following methods:
subtractive hybridization such as PCR-Select from Clontech, differential
display method such as the READS technology offered by Genelogic, or
Perkin-Elmer's GenScope, cDNA arrays such as GEM technology from Incyte,
or the high-density oligonucleotide matrix technologies offered by
Affymetrix.

A number of differentially expressed (estradiol activated) genes should be
identified. The cDNAs for these genes are sequenced and compiled into a
list of candidate genes. It is expected that many genes will be
identified, including the estrogen receptor.
Initial validation of estrogen-responsive genes

Zinc finger proteins are engineered to target each of the individual
members of the list of candidate genes, as described above and in
co-pending application USSN 09/229,037. The sequences of candidate genes
are scanned for unique and easily targetable 9 bp sequences. This process
will include searching databases for matches to previously sequenced genes
in order to obtain additional sequences and to confirm the accuracy of the
cDNA sequence generated above.
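
By way of illustration only, scanning a candidate sequence for 9 bp sequences that occur exactly once can be sketched as follows. This is a simplified sketch: it treats "unique" as unique within the supplied sequence on one strand, which is an assumption made for the example; it does not assess how easily a site can be targeted, and the input string is a toy sequence rather than a real cDNA.

```python
from collections import Counter


def unique_kmers(seq, k=9):
    """Return (offset, k-mer) pairs for k-mers that occur once in seq."""
    seq = seq.upper()
    positions = range(len(seq) - k + 1)
    counts = Counter(seq[i:i + k] for i in positions)
    return [(i, seq[i:i + k]) for i in positions if counts[seq[i:i + k]] == 1]


# Toy input: the poly-A 9-mer occurs twice, so only the last window is unique.
toy_cdna = "AAAAAAAAAAC"
print(unique_kmers(toy_cdna))
```

In practice the same counting step would be run against a larger background (for example, other sequenced genes returned by the database search) rather than against the candidate sequence alone.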

These designed zinc finger proteins are fused to functional domains, 
allowing both up regulation and knock-down of expression of the candidate genes, as 
described above. The functional domains to be employed are the Kruppel-associated box 
(KRAB) repression domain and the herpes simplex virus (HSV-1) VP16 activation 
domain. 

Repression of candidate genes 

For repressor studies, cells harboring the individual zinc finger proteins
are assayed for failure to grow due to blocking estrogen-dependent
functions. It has been established that estrogen receptor is essential for
growth in MCF-7; hence these cells should fail to grow when the ER gene or
other estrogen dependent functions are targeted for down regulation.

Cells are cultured in the medium previously described with and without
estradiol. Eukaryotic expression vectors, constructed to fuse the zinc
finger proteins to the SV40 NLS and KRAB, are described above.
Transfections are done using Lipofectamine, a commercially available
liposome preparation from GIBCO-BRL. All plasmid DNAs are prepared using
the Qiagen Midi DNA purification system. 10 µg of the effector plasmid is
mixed with 100 ng Lipofectamine (50 µl) in a total volume of 1600 µl of
Opti-MEM. A pCMV β-gal plasmid (Promega) will also be included in the DNA
mixture as an internal control for transfection efficiency. Following a 30
minute incubation, 6.4 ml of DMEM is added and the mixture is layered on
the cells. After five hours, the DNA-Lipofectamine mixture is removed, and
fresh culture medium containing 10% charcoal-stripped FCS, 10 µg/ml
insulin and 10 nM estradiol is layered on the cells.

Viability is assayed by trypan blue exclusion and monitoring growth.
Cells are trypsinized, concentrated by centrifugation and resuspended at approximately
10^6 cells/ml. A solution of 0.4% trypan blue is added to an equal volume of cells on a
hemocytometer slide. Total and stained cells are counted under a microscope. Growth is
monitored by measuring DNA synthesis. Radioactive [3H]thymidine (0.5 µCi at 30
Ci/mmol; Amersham) is added and the cells are allowed to grow for an additional 17 h.
The medium is removed and cells are lysed in situ with 1% SDS. Cell lysates are
precipitated with 15% trichloroacetic acid (TCA) and collected by filtration with
Whatman 3M filter discs and washed with 5% TCA then ethanol. Filters are dried and
thymidine incorporation is quantitated by liquid scintillation counting.
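
The trypan blue count reduces to simple arithmetic, sketched below in Python for
illustration only. The sketch assumes that unstained cells are scored as viable and
that mixing equal volumes of cell suspension and dye dilutes the cells two-fold. The
function names are hypothetical and are not part of the protocol.

```python
# Illustrative sketch only; names and conventions are assumptions, not the protocol.

# Equal volumes of cell suspension and 0.4% trypan blue give a 2-fold dilution.
DILUTION_FACTOR = 2


def trypan_blue_viability(total_cells, stained_cells):
    """Return the fraction of counted cells that excluded the dye (viable)."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= stained_cells <= total_cells:
        raise ValueError("stained_cells must lie between 0 and total_cells")
    return (total_cells - stained_cells) / total_cells


def undiluted_density(counted_density_per_ml):
    """Correct a density measured on the dye-diluted sample back to the
    density of the original cell suspension."""
    return counted_density_per_ml * DILUTION_FACTOR
```

A count of 200 total cells of which 30 are stained would, under these assumptions,
correspond to 170 viable cells out of 200.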

Activation of candidate genes 

Activation of each member of the list will also be performed to assay for
estrogen-independent growth of MCF-7 cells. Eukaryotic expression vectors are
constructed as described above. Transfections are done using Lipofectamine, a
commercially available liposome preparation from GIBCO-BRL. All plasmid DNAs are
prepared using the Qiagen Midi DNA purification system. Transfection is performed as
described above.

Viability is assayed by trypan blue exclusion and monitoring growth. Cells
are trypsinized, concentrated by centrifugation and resuspended at approximately 10^6
cells/ml. A solution of 0.4% trypan blue is added to an equal volume of cells on a
hemocytometer slide. Total and stained cells are counted under a microscope. Growth is
monitored by measuring DNA synthesis. Radioactive [3H]thymidine (0.5 µCi at 30
Ci/mmol; Amersham) is added and the cells are allowed to grow for an additional 17 h.
The medium is removed and cells are lysed in situ with 1% SDS. Cell lysates are
precipitated with 15% trichloroacetic acid (TCA) and collected by filtration with
Whatman 3M filter discs and washed with 5% TCA then ethanol. Filters are dried and
thymidine incorporation is quantitated by liquid scintillation counting.

Secondary validation

Additional testing will validate candidate genes identified during this first
round of repressor and activator studies. These zinc finger proteins are designed to target
two distinct and separated target sites in the candidate gene. Additionally, the specificity
and affinity of the zinc finger proteins are improved by fusing two three finger zinc finger
protein domains to form a six finger molecule that recognizes 18 bp.

Three finger zinc finger proteins are designed, produced and assayed by
EMSA as described herein. In order to locate suitable sequences, for which zinc finger
proteins can be easily and reliably designed, additional sequencing of the candidate genes
may be required. Furthermore, additional sequences may be found in nucleotide
sequence databases. Target sequences are chosen so that two 9 bp sequences are within
[...] bp of each other, thus allowing linking of the zinc finger protein pair. After identifying
pairs of three finger zinc finger proteins that bind with acceptable affinities and
specificities, the domains are linked by PCR, amplifying the domain which constitutes
fingers 4-6 of the six finger molecule. A short DNA sequence encoding a peptide
sequence predicted to be unstructured and flexible is added to the N-terminus of this
domain during amplification.
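
The pairing of two 9 bp target sequences can be sketched as a simple search, shown
below in Python for illustration only. The sketch assumes that the start positions of
suitable 9 bp sites have already been determined by other means, and it leaves the
maximum allowed spacing as a parameter (max_gap). The function name, the parameter
name and any values passed to it are hypothetical.

```python
# Illustrative sketch only; not the procedure used to select actual target sites.

SITE_LENGTH = 9  # each three finger zinc finger protein recognizes 9 bp


def find_linkable_pairs(site_starts, max_gap):
    """Return (upstream_start, downstream_start) pairs of 9 bp sites that
    do not overlap and are separated by at most max_gap bp."""
    starts = sorted(set(site_starts))
    pairs = []
    for i, first in enumerate(starts):
        for second in starts[i + 1:]:
            # Number of bases between the end of one site and the start of the next.
            gap = second - (first + SITE_LENGTH)
            if gap < 0:
                continue  # the two sites overlap
            if gap > max_gap:
                break  # starts are sorted, so later sites are farther away
            pairs.append((first, second))
    return pairs
```

Each returned pair identifies two 9 bp sites that together make up the 18 bp
recognized by a six finger molecule.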

Each construct is transiently transfected into MCF-7 cells growing in
culture and is scored for failure to grow (repression) or estrogen-independent growth
(activation) as described above.

Target validation using xenografts 

The effects of altered target gene expression on tumor growth are assessed
by xenografts in nude mice. The genes encoding the zinc finger proteins are cloned into
adeno-associated virus (AAV) or retrovirus-based viral vectors as described above. The
zinc finger proteins are fused to either KRAB or VP16 domains. The resulting
recombinant viruses are generated, purified and used to infect MCF-7 cells. These
transgenic cells are introduced subcutaneously into nude mice (Bissery et al., Semin.
Oncol. 22:3-16 (1995)). Tumors are measured twice weekly in order to estimate tumor
weight (Bissery et al., Semin. Oncol. 22:3-16 (1995); Kubota et al., J. Surg. Oncol.
64:115-121 (1997)). The experiment is allowed to progress until tumors obtain a weight
of 100-300 mg or the animals die.
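
The text above does not state how tumor weight is estimated from the measurements.
The Python sketch below is therefore an assumption for illustration only. It uses a
common caliper-based approximation, weight (mg) = length x width^2 / 2 with
dimensions in mm, which treats the tumor as an ellipsoid of unit density. The function
name is hypothetical.

```python
# Illustrative sketch only; the formula is an assumed, commonly used approximation
# and is not taken from the text.


def estimate_tumor_weight_mg(length_mm, width_mm):
    """Estimate tumor weight in mg from two perpendicular caliper
    measurements in mm, assuming an ellipsoid of density 1 mg/mm^3."""
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("measurements must be positive")
    # By convention the longer dimension is the length.
    longer, shorter = max(length_mm, width_mm), min(length_mm, width_mm)
    return longer * shorter ** 2 / 2
```

Under this approximation a tumor measuring 10 mm by 6 mm would be estimated at
180 mg.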

End-point assays will include macroscopic examination of the thoracic and
abdominal cavities to determine probable cause of death. Additional assays will include
histological analysis of tissue samples and excision of tumors for weighing.

c .„„ r l. TV- F 9 rrv Ar j- "- Y"*"" ""™ VerV ^ . f 

Vegetable oil quality is determined in part by the degree of saturation of
the component fatty acid side chains. Excessive desaturation (beyond one or two double
bonds) leads to poorer quality oils that are more prone to oxidation and rancidity.
Components of the biosynthetic machinery in oil producing seeds determine the degree of
desaturation. Inhibiting the expression of a gene whose product is involved in fatty acid
desaturation may lead to higher quality oils. Zinc finger proteins are used as probes for
differential gene expression experiments in order to identify genes that play a role in
setting the level of fatty acid saturation. Primary, secondary and tertiary zinc finger
proteins are used to validate the newly discovered gene function. Finally, transgenic
plants, producing higher quality oils, are produced.

Generating candidate genes through random mutagenesis

Starting material is either soybean (Glycine max) seeds or plants.
Mutagenesis is performed by either chemical treatment or random DNA insertion
(Katavic et al., Plant Physiol. 108:399-409 (1995); Martienssen, Proc. Natl. Acad. Sci.
U.S.A. 95:2021-2026 (1998); Hohn & Puchta, Proc. Natl. Acad. Sci. U.S.A. 96:8321-8323
(1999); Facciotti et al., Nature Biotech. 17:593-597 (1999)).

Chemical mutagenesis of seeds is performed by soaking in 0.3% (v/v)
ethylmethanesulfonate (EMS) for 16 h (Haughn & Somerville, Mol. Gen. Genet.
204:430-434 (1986)). M1 seeds are propagated and allowed to self-fertilize, then M2
seeds are randomly collected and propagated followed by another round of
self-fertilization to form M3 seeds. The fatty acid composition of the seeds and resulting
plants is analyzed as described below.

Alternatively, random DNA insertion can be performed by transposition
using a number of systems developed in plants (Martienssen, Proc. Natl. Acad. Sci.
U.S.A. 95:2021-2026 (1998)).

Identifying potential candidate genes by fatty acid and lipid analyses

Fatty acid and lipid composition is determined for approximately 20-30 of
the M3 seeds according to the method of Katavic (Plant Physiol. 108:399-409 (1995)).
Mature plant tissues are also similarly analyzed. Seeds are grouped into categories
according to degree of fatty acid saturation.
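
One possible way to group seeds by degree of saturation is sketched below in Python,
for illustration only. The text does not specify a scoring scheme, so the sketch
assumes a simple index (the average number of double bonds per fatty acid) compared
against an unmutagenized reference. The fatty acid species listed, the tolerance value
and all names are hypothetical.

```python
# Illustrative sketch only; the index, tolerance and names are assumptions.

# Number of double bonds for common seed fatty acids (carbons:double bonds).
DOUBLE_BONDS = {"16:0": 0, "18:0": 0, "18:1": 1, "18:2": 2, "18:3": 3}


def mean_double_bonds(composition):
    """Average number of double bonds per fatty acid, given a mapping of
    fatty acid name to relative amount (any consistent unit)."""
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must contain a positive total amount")
    weighted = sum(DOUBLE_BONDS[name] * amount
                   for name, amount in composition.items())
    return weighted / total


def categorize(composition, reference, tolerance=0.1):
    """Assign a seed to a category relative to a reference composition."""
    difference = mean_double_bonds(composition) - mean_double_bonds(reference)
    if difference > tolerance:
        return "elevated desaturation"
    if difference < -tolerance:
        return "reduced desaturation"
    return "unchanged"
```

Seeds falling into the elevated or reduced categories would then be the ones carried
forward for expression profiling.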

Expression profiles are generated for seeds expressing either elevated or
reduced degrees of desaturation by employing one of the methods described in Example
III. (Note: FAD2-1, encoding omega-6 desaturase, is expected to be a gene
repressed in seeds that have lower levels of polyunsaturated long chain fatty acids.)
Once a particular gene has been identified as participating in the altered phenotype,
cDNA is selected for sequencing.

Initial target validation with primary zinc finger proteins

Zinc finger proteins are engineered to target each of the individual
members of the list of candidate genes, as described above and in co-pending application
USSN 09/229,037. The sequences of candidate genes are scanned for unique and easily
targetable sequences. This process includes searching databases for matches to
previously sequenced genes in order to obtain additional sequences and to confirm the
accuracy of the cDNA sequence generated above.

These designed zinc finger proteins are fused to functional domains,
allowing both up regulation and knock-down of expression of the candidate genes, as
described above. The functional domains to be employed are the Kruppel-associated box
(KRAB) repression domain and the herpes simplex virus (HSV-1) VP16 activation
domain. The genes encoding the ZFP-functional domain fusions are cloned into a
plant expression vector such as pCAMBIA1301. This vector possesses the following
attributes: 1) a selectable marker such as the gene encoding hygromycin resistance; 2)
T-DNA borders for Agrobacterium-mediated transformation; 3) [...]
desired promoters (such as CaMV 35S, napin or phaseolin promoters); 4) a plant
polyadenylation signal such as Nos; 5) a GUS reporter gene.

Designed zinc finger proteins are tested for activity against the desired
target by assaying activation or repression of reporter genes. A single plasmid that
independently expresses the zinc finger protein and the reporter is used. The target
sequence is inserted in the DNA near the start site for transcription for the GUS gene.

[...] EMSA and in vivo function as assessed by reporter assays are transformed into
soybean somatic embryos via particle bombardment of proliferating embryogenic cultures
derived from cotyledons of immature seeds (Liu et al., [...]). [...]
generate more specific zinc finger proteins.

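Secondary and tertiary zinc finger proteins to further validate target in desaturation pathway

Additional testing is used to validate candidate genes identified during
this first round of repressor and activator studies. These zinc finger proteins are designed
to target two distinct and separated target sites in the candidate gene. Additionally, the
specificity and affinity of the zinc finger proteins are improved by fusing two three finger
zinc finger protein domains to form a six finger molecule that recognizes 18 bp.

Three finger zinc finger proteins are designed, produced and assayed by
EMSA as described herein. In order to locate suitable sequences, for which zinc finger
proteins can be easily and reliably designed, additional sequencing of the candidate genes
may be required. Furthermore, additional sequences may be found in nucleotide
sequence databases. Target sequences are chosen so that two 9 bp sequences are within
[...] bp of each other, thus allowing linking of the zinc finger protein pair. After identifying
pairs of three finger zinc finger proteins that bind with acceptable affinities and
specificities, the domains are linked by PCR, amplifying the domain which constitutes
fingers 4-6 of the six finger molecule. A short DNA sequence encoding a peptide
sequence predicted to be unstructured and flexible is added to the N-terminus of this
domain during amplification.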
[...] domains and analyzed in tobacco callus reporter studies then in soybean plants as
[...] the second zinc finger protein is chosen. Again, zinc finger proteins designed to
recognize 18 bp are designed and tested as described herein. These zinc finger proteins are
introduced into soybean and the resulting alteration in fatty acid and lipid profiles will
again be examined.



